Data Mining for Material Synthesis
February 19, 2018
While most people think that science operates in a totally logical, Mr. Spock style, there's actually a lot of intuition involved in successful theory-building. I've found that I get my best ideas when I'm in that dreamlike state experienced before nodding off to sleep. Another time of idea generation is when I'm trapped in my own thoughts without any distraction, as when I'm in a waiting room or sitting quietly before the start of a movie, concert, or church service.
I don't think I've ever solved a problem in an actual dream, although lucid dreaming, in which the dreamer is aware of dreaming, is sometimes credited with such feats. There are a few examples of dreams leading to some famous ideas in science, technology, and mathematics, but these are so few in number that I wouldn't spend too much time sleeping instead of working.
Of course, the most famous example of a dream-induced scientific idea is the discovery of the ring structure of benzene by the German chemist August Kekulé (1829-1896). By his own account, the idea of the benzene ring structure came to Kekulé in a day-dream of the ouroboros, the image of a snake biting its own tail.
An 1898 engraving of August Kekulé (1829-1896) from vol. 54 of Popular Science Monthly.
(Via Wikimedia Commons)
An example from technology is Elias Howe's idea for an improved sewing machine in 1845. Howe had been trying to design an effective sewing machine when, according to family records, he had a dream that he was held captive by the king of a strange country who demanded that he build a sewing machine. Threatened by spear-carrying warriors, Howe noticed that their spears had holes near the pointed tip. Unlike a manual sewing needle, which is threaded at the end opposite the point, a machine needle, he realized, is best threaded near its point. Howe awoke, headed to his workshop, and had a prototype working in a few hours.
Fig. 2 of US Patent No. 4,750, "Improvement in sewing-machines," by Elias Howe, Jr., September 10, 1846.
As stated in the patent, "The needle used has the eye that is to receive the thread within a small distance - say, an eighth of an inch - of its inner or pointed end."
Via Google Patents.[1]
Mathematics has had its dreamers, also, the prime example of which was Ramanujan. Ramanujan said that a Hindu goddess, Namagiri of Namakkal, presented mathematical formulas in his dreams that he verified after waking. In this, Ramanujan follows in the tradition of the ancient Greek philosophers who believed that some dreams carried messages from the gods. Although Aristotle (384-322 BC) treated dreams as a scientific phenomenon in his short essay, On Dreams (Περι ενυπνιων, or De insomniis),[2-4] the dialogues of his mentor, Plato (c.427 BC - c.347 BC), consider dreams to be messages from the gods.
The intuitions that come to scientists in day-dreams probably arise from random connections between memorized facts, in analogy to how genetic algorithms function. The mind pieces together random ideas, subjects them to a "fitness test," and selects the ones that make the most sense. When not dreaming, I would perform a similar process in materials development, combining knowledge of known materials to generate new materials. Having handy recall of things such as ionic sizes, crystal structure, material properties, and some rules-of-thumb made the process proceed more quickly.
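The genetic-algorithm analogy can be made concrete with a toy sketch. Everything in it is an assumption chosen for illustration: the target string, the character-matching fitness test, the population size, and the mutation rate are hypothetical and do not come from any materials study. It simply shows random recombination, a fitness test, and selection of the best candidates.

```python
import random

# Hypothetical "idea that makes sense"; a stand-in, not real materials data.
TARGET = "TiO2"
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def fitness(candidate: str) -> int:
    """Fitness test: number of positions that match the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Piece two parent 'ideas' together at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(candidate: str, rng: random.Random, rate: float = 0.2) -> str:
    """Randomly replace each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in candidate
    )


def evolve(generations: int = 500, pop_size: int = 50, seed: int = 0) -> str:
    """Return the fittest candidate found within the generation budget."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:
            break
        # Selection: the fitter half survives unchanged.
        survivors = population[: pop_size // 2]
        # Recombination and mutation refill the population.
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.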
In this age of ubiquitous computing, it's easy to see how materials development can be automated. That's what a team of materials scientists and computer scientists from the Massachusetts Institute of Technology (MIT, Cambridge, Massachusetts), the University of Massachusetts Amherst (Amherst, Massachusetts), and the University of California Berkeley (Berkeley, California) thought when they examined the use of text extraction from the scientific literature and machine learning to generate schemes for materials synthesis. Their work is reported in a recent article in the journal Chemistry of Materials.[5-6]
Kitchen chemistry.
Much chemistry involves production of known chemicals according to a recipe. That's the reason for the popular analogy between cooking and chemistry.
(MIT image by Chelsea Turner.)
Many computational efforts have generated novel materials for catalysis, thermoelectrics, and other applications, so the bottleneck becomes the synthesis of these materials.[5] The usual development of synthesis processes has relied on the intuition and experience of the materials scientist, guided by a traditional literature search and review.[6] The present study uses artificial intelligence techniques, specifically natural language processing, to data mine tens of thousands of research papers and automatically deduce recipes for producing such novel materials.[5]
Says Elsa Olivetti, a professor in MIT's Department of Materials Science and Engineering and co-author of this study,
"Computational materials scientists have made a lot of progress in the 'what' to make - what material to design based on desired properties... but because of that success, the bottleneck has shifted to, 'Okay, now how do I make it?'"[6]
As a proposed end-product of this research program, there would be a database of material recipes data mined from millions of journal articles. An underlying machine-learning system would use natural language processing to deduce materials recipes and synthesis parameters from these articles.[6] A user would enter the name of a target material along with criteria such as proposed precursor materials and reaction conditions, among other parameters, and the system would return suggested recipes for its synthesis.[6]
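A query against such a recipe database might look something like the following minimal sketch. The `Recipe` record, the `find_recipes` function, and all of the entries are hypothetical placeholders invented for illustration. They are not the researchers' schema, and the values are not taken from the literature.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Recipe:
    """One mined synthesis recipe (hypothetical schema)."""
    target: str
    precursors: Tuple[str, ...]
    method: str
    temperature_c: float


def find_recipes(
    db: List[Recipe],
    target: str,
    precursor: Optional[str] = None,
    method: Optional[str] = None,
    max_temperature_c: Optional[float] = None,
) -> List[Recipe]:
    """Return recipes for `target` that satisfy every criterion supplied."""
    matches = []
    for recipe in db:
        if recipe.target != target:
            continue
        if precursor is not None and precursor not in recipe.precursors:
            continue
        if method is not None and recipe.method != method:
            continue
        if max_temperature_c is not None and recipe.temperature_c > max_temperature_c:
            continue
        matches.append(recipe)
    return matches


# Placeholder entries only; names and temperatures are made up.
DB = [
    Recipe("material-A", ("precursor-X", "precursor-Y"), "hydrothermal", 150.0),
    Recipe("material-A", ("precursor-Z",), "sol-gel", 450.0),
    Recipe("material-B", ("precursor-X",), "hydrothermal", 180.0),
]

if __name__ == "__main__":
    for recipe in find_recipes(DB, "material-A", method="hydrothermal"):
        print(recipe)
```

Criteria left as `None` are ignored, so the same function serves both a broad search by material name and a narrow search by precursor and reaction conditions.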
As a demonstration of this approach, the research team examined the synthesis conditions for various metal oxides by data mining more than twelve thousand articles and then predicted recipes for synthesis of titania nanotubes via hydrothermal synthesis. Both supervised and unsupervised machine-learning techniques were used. The supervised method involved annotation by humans, while the unsupervised method had the system learn how to organize the data.[6] Using an algorithm called Word2vec, which was developed at Google, the researchers were able to train their system with about 640,000 papers.[6]
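Word2vec learns a vector for each word from the words that appear near it in text. In its skip-gram variant, the training examples are (center word, context word) pairs drawn from a sliding window. The sketch below shows only that pair-generation step in simplified form. It does not train a model, and the example sentence and window size are assumptions chosen for illustration, not data from the study.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Pair each token with every neighbor within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical sentence of the kind found in a synthesis paragraph.
sentence = "the powder was heated at high temperature".split()
pairs = skipgram_pairs(sentence, window=1)

if __name__ == "__main__":
    for center, context in pairs:
        print(center, "->", context)
```

Words that occur in similar contexts produce similar sets of pairs, which is why a model trained on them tends to place such words close together in vector space.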
In an analysis of the system’s accuracy, they found that it was able to identify paragraphs that contained recipes with 99% accuracy and label the words within those paragraphs with 86% accuracy.[6] The further objectives of this research are to improve accuracy and to use deep learning techniques to automatically devise recipes for those materials not included in the existing scientific literature.[6] This research was funded by the National Science Foundation, the Office of Naval Research, and the Department of Energy, among other sources.[6]
References:
1. Elias Howe, Jr., "Improvement in sewing-machines," US Patent No. 4,750, September 10, 1846.
2. Aristotle, "On Dreams," J. I. Beare, Trans., The Internet Classics Archive by Daniel C. Stevenson.
3. Aristotle, "On Dreams," J. I. Beare, Trans., The University of Adelaide Library.
4. Aristotle, "On Dreams," Greek text via Wikisource.
5. Edward Kim, Kevin Huang, Adam Saunders, Andrew McCallum, Gerbrand Ceder, and Elsa Olivetti, "Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning," Chem. Mater. (Article ASAP, October 19, 2017), DOI: 10.1021/acs.chemmater.7b03500.
6. Larry Hardesty, "Artificial intelligence aids materials fabrication," MIT Press Release, November 5, 2017.