A Bayesian Approach to Predict Solubility Parameters

B Sanchez-Lengeling and LM Roch and JD Perea and S Langner and CJ Brabec and A Aspuru-Guzik, ADVANCED THEORY AND SIMULATIONS, 2, 1800069 (2019).

DOI: 10.1002/adts.201800069

Solubility is a ubiquitous phenomenon in many aspects of material science. While solubility can be determined by considering the cohesive forces in a liquid via the Hansen solubility parameters (HSP), quantitative structure-property relationship models are often used for prediction, notably due to their low computational cost. Here, gpHSP, an interpretable and versatile probabilistic approach to determining HSP, is reported. Our model is based on Gaussian processes, a Bayesian machine learning approach that provides uncertainty bounds to prediction. gpHSP achieves its flexibility by leveraging a variety of input data, such as SMILES strings, COSMOtherm simulations, and quantum chemistry calculations. gpHSP is built on experimentally determined HSP, including a general solvents set aggregated from the literature, and a polymer set experimentally characterized by this group of authors. In all sets, a high degree of agreement is obtained, surpassing well-established machine learning methods. The general applicability of gpHSP to miscibility of organic semiconductors, drug compounds, and in general solvents is demonstrated, which can be further extended to other domains. gpHSP is a fast and accurate toolbox, which could be applied to molecular design for solution processing technologies.
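
To illustrate what "uncertainty bounds to prediction" means for a Gaussian process, the following is a minimal sketch of GP regression in plain Python. It is not the gpHSP implementation: the squared-exponential kernel, the fixed hyperparameters, the helper names (`rbf`, `gp_predict`, and so on), and the toy data are all assumptions made for illustration, and the numbers are synthetic rather than measured solubility parameters.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two descriptor vectors (assumed kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return var * math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X, y, x_new, noise=1e-4):
    """Return the posterior mean and standard deviation at x_new."""
    n = len(X)
    mean_y = sum(y) / n  # constant prior mean, set to the training average
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [yi - mean_y for yi in y]))
    k_star = [rbf(xi, x_new) for xi in X]
    mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(variance, 0.0))

# Synthetic one-dimensional descriptors and target values (illustrative only).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [15.0, 16.0, 18.0, 17.0]

for query in ([1.0], [1.5], [10.0]):
    mean, std = gp_predict(X, y, query)
    print(f"x={query[0]:5.1f}  mean={mean:6.2f}  std={std:5.3f}")
```

In this sketch the predicted standard deviation is small near the training points and grows toward the prior value far from them, which is the behaviour that lets a GP-based model flag predictions made outside the range of its training data.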
