Your browser doesn't support javascript.
loading
Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture.
Rajan, Kohulan; Brinkhaus, Henning Otto; Zielesny, Achim; Steinbeck, Christoph.
Afiliación
  • Rajan K; Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany.
  • Brinkhaus HO; Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany.
  • Zielesny A; Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665, Recklinghausen, Germany.
  • Steinbeck C; Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany. christoph.steinbeck@uni-jena.de.
J Cheminform ; 16(1): 78, 2024 Jul 05.
Article en En | MEDLINE | ID: mdl-38970120
ABSTRACT
Accurate recognition of hand-drawn chemical structures is crucial for digitising hand-written chemical information in traditional laboratory notebooks or facilitating stylus-based structure entry on tablets or smartphones. However, the inherent variability in hand-drawn structures poses challenges for existing Optical Chemical Structure Recognition (OCSR) software. To address this, we present an enhanced Deep lEarning for Chemical ImagE Recognition (DECIMER) architecture that leverages a combination of Convolutional Neural Networks (CNNs) and Transformers to improve the recognition of hand-drawn chemical structures. The model incorporates an EfficientNetV2 CNN encoder that extracts features from hand-drawn images, followed by a Transformer decoder that converts the extracted features into Simplified Molecular Input Line Entry System (SMILES) strings. Our models were trained using synthetic hand-drawn images generated by RanDepict, a tool for depicting chemical structures with different style elements. A benchmark was performed using a real-world dataset of hand-drawn chemical structures to evaluate the model's performance. The results indicate that our improved DECIMER architecture exhibits a significantly enhanced recognition accuracy compared to other approaches. SCIENTIFIC CONTRIBUTION The new DECIMER model presented here refines our previous research efforts and is currently the only open-source model tailored specifically for the recognition of hand-drawn chemical structures. The enhanced model performs better in handling variations in handwriting styles, line thicknesses, and background noise, making it suitable for real-world applications. The DECIMER hand-drawn structure recognition model and its source code have been made available as an open-source package under a permissive license.
Palabras clave

Texto completo: 1 Colección: 01-internacional Base de datos: MEDLINE Idioma: En Revista: J Cheminform Año: 2024 Tipo del documento: Article País de afiliación: Alemania Pais de publicación: Reino Unido

Texto completo: 1 Colección: 01-internacional Base de datos: MEDLINE Idioma: En Revista: J Cheminform Año: 2024 Tipo del documento: Article País de afiliación: Alemania Pais de publicación: Reino Unido