Accelerating Formulation Design via Machine Learning: Generating a High-throughput Shampoo Formulations Dataset.
Chitre, Aniket; Querimit, Robert C M; Rihm, Simon D; Karan, Dogancan; Zhu, Benchuan; Wang, Ke; Wang, Long; Hippalgaonkar, Kedar; Lapkin, Alexei A.
Affiliation
  • Chitre A; Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK.
  • Querimit RCM; Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, Singapore, 138602, Singapore.
  • Rihm SD; Institute of Materials Research and Engineering, Agency for Science, Technology and Research (A*STAR), Singapore, 138634, Singapore.
  • Karan D; Institute of Materials Research and Engineering, Agency for Science, Technology and Research (A*STAR), Singapore, 138634, Singapore.
  • Zhu B; School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, Singapore, 637459, Singapore.
  • Wang K; Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK.
  • Wang L; Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, Singapore, 138602, Singapore.
  • Hippalgaonkar K; Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, Singapore, 138602, Singapore.
  • Lapkin AA; BASF Advanced Chemicals Co. Ltd., No. 300, Jiang Xin Sha Road, Pudong, Shanghai, 200137, China.
Sci Data; 11(1): 728, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38961122
ABSTRACT
Liquid formulations are ubiquitous yet have lengthy product development cycles, because the complex physical interactions between ingredients make it difficult to tune formulations to customer-defined property targets. Interpolative ML models can accelerate liquid formulation design but are typically trained on limited sets of ingredients and without any structural information, which limits their out-of-training predictive capacity. To address this challenge, we selected eighteen formulation ingredients covering a diverse chemical space to prepare an open experimental dataset for training ML models for rinse-off formulation development. The resulting design space has a more than 50-fold increase in dimensionality compared to our previous work. Here, we present a dataset of 812 formulations, including 294 stable samples, which cover the entire design space, with phase stability, turbidity, and high-fidelity rheology measurements generated on our semi-automated, ML-driven liquid formulations workflow. Our dataset has the unique attribute of sample-specific uncertainty measurements to train predictive surrogate models.

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Sci Data Year: 2024 Document type: Article Affiliation country: United Kingdom Country of publication: United Kingdom
