Decision tree-based identification of important molecular fragments for protein-ligand binding.
Chem Biol Drug Des; 103(1): e14427, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38230776
ABSTRACT
Fragment-based drug design is an emerging technology in pharmaceutical research and development. One of the key aspects of this technology is the identification and quantitative characterization of molecular fragments. This study presents a strategy for identifying important molecular fragments based on molecular fingerprints and decision tree algorithms, and verifies its feasibility in predicting protein-ligand binding affinity. Specifically, the three-dimensional (3D) structures of protein-ligand complexes are encoded using extended-connectivity fingerprints (ECFP), and three decision tree-based models, namely Random Forest, XGBoost, and LightGBM, are used to quantify feature importance, thereby extracting important molecular fragments with high reliability. Few-shot learning reveals that the extracted molecular fragments contribute significantly and consistently to the binding affinity even with a small sample size. Although ECFP carries no location or distance information for molecular fragments, 3D visualization, in combination with the reverse ECFP process, shows that the majority of the extracted fragments are located at the binding interface of the protein and the ligand. This agreement with the distance constraints critical for binding affinity further supports the reliability of the strategy for identifying important molecular fragments.
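The idea of ranking fingerprint bits by tree-based feature importance can be illustrated with a minimal sketch. It is not the authors' pipeline: it does not compute real ECFP and does not use Random Forest, XGBoost, or LightGBM. Instead, it assumes hand-made binary fingerprints and made-up affinity values, and scores each bit by the variance reduction of a single split on that bit (a depth-1 tree), which is the quantity that tree ensembles accumulate into impurity-based importances. The names `stump_importance` and `rank_bits` are hypothetical helpers introduced only for this example.

```python
from statistics import pvariance


def stump_importance(fingerprints, affinities):
    """Score each fingerprint bit by the variance reduction of one split on it.

    This is the impurity decrease of a depth-1 regression tree per bit,
    a simplified stand-in for ensemble feature importance.
    """
    n = len(affinities)
    total_var = pvariance(affinities)
    scores = []
    for bit in range(len(fingerprints[0])):
        on = [y for fp, y in zip(fingerprints, affinities) if fp[bit] == 1]
        off = [y for fp, y in zip(fingerprints, affinities) if fp[bit] == 0]
        if not on or not off:
            # A constant bit cannot split the data, so it gets no importance.
            scores.append(0.0)
            continue
        within = (len(on) * pvariance(on) + len(off) * pvariance(off)) / n
        scores.append(total_var - within)
    return scores


def rank_bits(scores):
    """Return bit indices ordered from most to least important."""
    return sorted(range(len(scores)), key=lambda b: scores[b], reverse=True)


# Toy data (assumption): each row is a 4-bit fingerprint of one complex,
# each value a made-up binding affinity. Bit 0 tracks affinity by design.
fingerprints = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
]
affinities = [8.1, 7.9, 8.0, 5.2, 5.0, 5.1]

scores = stump_importance(fingerprints, affinities)
ranking = rank_bits(scores)
print("importance per bit:", [round(s, 3) for s in scores])
print("bits ranked by importance:", ranking)
```

In a real setting, the top-ranked bits would then be mapped back to the substructures that set them (the reverse ECFP step mentioned in the abstract), which requires a cheminformatics toolkit and is outside the scope of this sketch.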
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Proteins
Study type: Diagnostic_studies / Health_economic_evaluation / Prognostic_studies
Language: En
Journal: Chem Biol Drug Des
Journal subject: Biochemistry / Pharmacy / Pharmacology
Year: 2024
Document type: Article
Country of affiliation: China
Country of publication: United Kingdom