Text and Data Mining for Material Synthesis
Invited
Abstract
Data has become a fundamental ingredient for accelerating and optimizing materials design and synthesis. Molecular synthesis planning, driven by advances in machine learning, has recently achieved human-level performance for the retrosynthetic design of organic molecules. The acceleration of data-driven synthesis planning and related analyses has been enabled, in part, by access to massive datasets that tabulate known chemical reactions. While macromolecule, polymer, and inorganic materials databases also exist, they focus primarily on material structures and properties rather than on reactions and synthesis. Indeed, there is currently no comprehensive dataset that organizes the methods by which these materials are synthesized, or even one that provides extensive property information. Comprehensively extracting the knowledge contained in written inorganic materials syntheses, without significant human effort, is a key step towards reducing the overall discovery and development time for novel materials. This presentation will describe work to extract information from peer-reviewed academic literature across a range of inorganic solid-state materials synthesis approaches. We have demonstrated the potential of natural language processing (NLP) to assemble materials data from the literature, and we have also shown that machine learning approaches can be used to develop hypotheses about which synthesis conditions drive a particular target material outcome.
Presenters
- Elsa Olivetti, Massachusetts Institute of Technology
Authors
- Elsa Olivetti, Massachusetts Institute of Technology
- Edward Kim, Massachusetts Institute of Technology