Christopher Ré is a computer scientist democratizing big data analytics through theoretical advances in statistics and logic and through data-processing applications that solve practical problems. Ré has leveraged his training in databases and deep knowledge of machine learning to create an inference engine, DeepDive, that can analyze data of a kind and at a scale beyond the current capabilities of traditional databases.
DeepDive analyzes “dark data,” the mass of data buried in text, illustrations, images, and similar sources that conventional tools cannot process. It extracts relationships among entities (i.e., real-world objects) in the data and infers facts involving those entities. These facts, or assertions, form a knowledge base, which can then be integrated into an existing database. DeepDive has proved to be more accurate than human annotation, and it can be “trained,” even by users without computer science expertise, to improve the quality of its results through simple domain-specific rules and low-level feedback about which predictions are correct and which are erroneous. Ré is also working to overcome the technical challenge of computing the many possible interpretations of each data item quickly and efficiently. He has made significant advances in Incremental (Stochastic) Gradient Descent (IGD) and has prototyped Hogwild!, which runs this kind of gradient descent in parallel across the cores of a multicore machine without locking, and Bismarck, which integrates various analytical tasks into a traditional database system without the need for separate code paths.
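For readers unfamiliar with incremental gradient descent, the following is a minimal sketch of the general technique, not code from DeepDive, Hogwild!, or Bismarck. It fits a small logistic-regression model by updating the weights after each individual example. The function names (`igd_logistic`, `predict`), the step size, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def igd_logistic(examples, epochs=50, step=0.1, seed=0):
    """Fit logistic-regression weights by incremental gradient descent.

    examples: list of (features, label) pairs with label in {0, 1}.
    Returns a list of weights whose last entry is the bias term.
    """
    rng = random.Random(seed)
    dim = len(examples[0][0])
    w = [0.0] * (dim + 1)
    order = list(range(len(examples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a fresh random order
        for i in order:
            x, y = examples[i]
            z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
            err = sigmoid(z) - y  # derivative of the log-loss w.r.t. z
            # Update the model immediately, using only this one example.
            for j in range(dim):
                w[j] -= step * err * x[j]
            w[-1] -= step * err
    return w


def predict(w, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1 if z >= 0 else 0


# Toy data (made up): the label is 1 when the first feature exceeds the second.
data = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
weights = igd_logistic(data)
```

Because each update touches only one example, the inner loop is cheap and needs no pass over the full data set before the model changes. A Hogwild!-style system would, roughly speaking, let several processor cores run that inner loop at once on shared weights without locks; this sketch stays single-threaded.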
These improvements have led to the application of DeepDive in a wide range of settings, from scientific laboratories to law enforcement. For example, the Pharmacogenomics Knowledgebase uses it to extract data about the relationships among genes, diseases, and drugs from biomedical literature, building an understanding of the interactions between genes and drugs in the body that is critical for drug discovery. A current DARPA program is using DeepDive to extract data about human trafficking networks from the dark web. Ré’s work across theory and practice, together with his commitment to creating open-source code that can be integrated into more inclusive architectures or applied systems, is revolutionizing our ability to make this new world of big data truly accessible and widely useful.
Christopher Ré received a B.S. (2001) from Cornell University and a Ph.D. (2009) from the University of Washington at Seattle. He was an assistant professor (2009–2013) at the University of Wisconsin at Madison before joining the faculty of Stanford University, where he is currently an assistant professor in the Department of Computer Science. His work has been recognized with best paper awards from PODS (Annual Symposium on Principles of Database Systems) and SIGMOD (Annual ACM International Conference on Management of Data) and has appeared in such proceedings and journals as SIGMOD, VLDB (Proceedings of the Annual Conference on Very Large Data Bases), and Neural Information Processing Systems (NIPS), among others.