Dr. Rakesh Agrawal

 

Rakesh Agrawal
Member, National Academy of Engineering
ACM Fellow, IEEE Fellow
Ex-Microsoft Technical Fellow, Ex-IBM Fellow
ragrawal@acm.org

Rakesh Agrawal is an innovator and thought leader who is driven by the desire to change the world by making scientific breakthroughs and by building practical working systems. He is the recipient of the ACM-SIGKDD Inaugural Innovation Award, ACM-SIGMOD Edgar F. Codd Innovations Award, ACM-SIGMOD Test of Time Award (twice), VLDB 10-Yr Most Influential Paper Award, ICDE Most Influential Paper Award, and the Computerworld First Horizon Award. Scientific American named him to the list of 50 top scientists and technologists in 2003. He is a Member of the National Academy of Engineering, a Fellow of ACM, and a Fellow of IEEE.

Until recently, Rakesh was a Microsoft Technical Fellow and headed the Search Labs in Microsoft Research. Prior to joining Microsoft in March 2006, Rakesh was an IBM Fellow and led the Quest group at the IBM Almaden Research Center. Earlier, he was with Bell Laboratories, Murray Hill, from 1983 to 1989. He also worked for three years at India's premier company, Bharat Heavy Electricals Ltd. He received the M.S. and Ph.D. degrees in Computer Science from the University of Wisconsin-Madison in 1983. He also holds a B.E. degree in Electronics and Communication Engineering from IIT-Roorkee, and a two-year Post Graduate Diploma in Industrial Engineering from the National Institute of Industrial Engineering (NITIE), Bombay. Both IIT-Roorkee and NITIE have honored him with their distinguished alumnus awards.

Rakesh has been granted more than 75 patents. He has published more than 200 research papers, many of them considered seminal. He has written both the most cited and the second most cited of all papers in the fields of databases and data mining (18th and 26th most cited across all of computer science). Wikipedia lists one of his papers among the most influential database papers. His papers have been cited more than 80,000 times, with more than 25 of them receiving more than 500 citations each and three of them receiving 5,000 citations each (Google Scholar). He is the most cited author in the field of database systems and the 26th most cited author across all of Computer Science (Citeseer). His work has been featured in the New York Times Year in Review, the New York Times Science section, and several other venues. He has founded and served in key positions in international societies and has provided leadership in important professional activities.

Rakesh is increasingly sought after to help with studies on topics of national and international interest. In 2005, he played a key role in IBM's study for the President of India on Improving India's Education System through Information Technology. In 2008, he was a key member of the National Academy of Sciences study on the Interoperability of Voter Registration Databases. He was also the only computer scientist in the 2010 National Research Council study on Science and Technology strategies of key countries worldwide. Recently, Rakesh has been in dialog with the Government of India about using his technology to enrich textbooks used by millions of students.

Key Technical Accomplishments

With the unprecedented growth rate at which data is being collected and stored electronically in almost all fields of human endeavor, the efficient and responsible extraction of useful information from data has become a crucial scientific challenge and a critical economic need. In the early 1990s, Rakesh and his team began devising algorithms for asking open-ended queries, eventually authoring a 1993 paper on association rule discovery that later became the foundational paper for the field of knowledge discovery and data mining. Association rule discovery is a data mining approach for finding unexpected patterns in large data sets. Rakesh is fondly referred to as the father of data mining because of this seminal work and other fundamental data mining concepts and technologies he devised. It is noteworthy that the ACM SIG on Knowledge Discovery and Data Mining awarded its inaugural Innovation Award for outstanding technical contributions to the field to Rakesh. Four of his papers on data mining have received "test-of-time" awards: two from SIGMOD and one each from VLDB and ICDE.

It is rare that a researcher's work creates not only a product, but a whole new industry. IBM's data mining product, Intelligent Miner, grew straight out of Rakesh's research. IBM's introduction of Intelligent Miner and associated services created a new category of software and services. His research has been incorporated into many other commercial products, including DB2 Mining Extender, DB2 OLAP Server, WebSphere Commerce Server, and the Microsoft Bing search engine, as well as many research prototypes and applications.

Subsequently, Rakesh and his team pioneered key concepts in data privacy, including Hippocratic Databases, Privacy-Preserving Data Mining, and Sovereign Information Sharing. In a series of papers, the Hippocratic database work laid out the principles, architecture, and technologies for a database system that included the responsibility for privacy of data as a founding tenet. The privacy-preserving data mining work invented techniques for building accurate decision models without accessing precise information in individual data records. The sovereign information sharing work allows autonomous entities to apply database operations across private databases in such a way that no information apart from the result is revealed. These bodies of work have gained increasing importance given the emergence of cloud computing and well-publicized inadvertent and intentional misuse of data collections. In recognition, SIGMOD 2014 selected one of his data privacy papers to receive the test-of-time award.

Recently, Rakesh and colleagues have been furthering and applying data mining in novel ways to enhance electronic textbooks and online education. This body of work has already provided technologies for algorithmically identifying deficient sections in a textbook, augmenting textbooks with rich content in multiple formats mined from the Web, and forming study teams with the goal of maximizing overall learning. Given the criticality of high-quality education for success in modern society, this body of work is destined to gain increasing importance.