Machine Learning and Big-Data in Computational Chemistry

Experimental chemistry and the younger discipline of computational chemistry have always aspired to increase data volume, velocity, and variety. Recent software developments in machine learning, databases, and automation, together with hardware advances in fast co-processors, networking, and storage, have boosted automation and digitization. Computational chemistry is seemingly on the verge of a big-data revolution.

In this chapter, we discuss how many of these data-driven paradigms are part of a long-term trend and how data have long been at the heart of many chemical problems. Historical repositories of chemical data, from which the modern cheminformatician can mine high-value curated training data, are reviewed. Modern automation tools and datasets available for high-data computational chemistry are described. Current applications of computer-driven discovery of molecular materials in optoelectronics (photovoltaics and light-emitting diodes) and electrical energy storage are discussed. Finally, the impact of machine learning approaches on the computational chemistry areas of structure-property relationships and chemical space, with an emphasis on generative models, is analyzed.

Acknowledgments

AAG acknowledges support from the Department of Energy, Office of Basic Energy Sciences, under award DE-SC0015959. He also thanks Dr. Anders Frøseth for his generous support of this work. RGB acknowledges the Toyota Career Development Chair for financial support.