Open-Source Machine Learning in Computational Chemistry

A Hagg and KN Kirschner, JOURNAL OF CHEMICAL INFORMATION AND MODELING, 63, 4505-4532 (2023).

DOI: 10.1021/acs.jcim.3c00643

The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
