A Semi-Supervised Autoencoder-Based Approach for Protein Function Prediction.
IEEE J Biomed Health Inform; 26(10): 4957-4965, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35349463
With the development of next-generation sequencing techniques, protein sequences have become abundantly available. Determining the functional characteristics of these proteins, however, is costly and time-consuming, so the gap between the number of known protein sequences and the number with annotated functions keeps widening. Advanced machine-learning methods have stepped up to fill this gap. In this work, a deep-learning-based approach is proposed for predicting protein function from protein sequences. A set of autoencoders is trained in a semi-supervised manner on protein sequences, with each autoencoder corresponding to exactly one protein function. In particular, 932 autoencoders corresponding to 932 biological processes and 585 autoencoders corresponding to 585 molecular functions are trained separately. The reconstruction losses of each protein sample under every autoencoder are used as features to classify the sequences into their corresponding functions. The proposed model is evaluated on test protein samples and achieves promising results. The method can be easily extended to predict any number of functions for which ample supporting protein sequences are available. All relevant code, data, and trained models are available at https://github.com/richadhanuka/PFP-Autoencoders.
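The idea of using per-function reconstruction losses as features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes toy numeric vectors in place of encoded protein sequences, substitutes a rank-k linear autoencoder (equivalent to PCA) for the deep autoencoders, fits each model only on samples of its own hypothetical function, and uses a simple lowest-loss rule where the abstract leaves the final classifier unspecified. All names (`fit_linear_autoencoder`, `loss_features`, `sample`) are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(X, k):
    """Fit a rank-k linear autoencoder (equivalent to PCA) on the rows of X."""
    mean = X.mean(axis=0)
    # The top-k right singular vectors act as tied encoder/decoder weights.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


def reconstruction_loss(model, X):
    """Per-sample mean squared reconstruction error under one autoencoder."""
    mean, components = model
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return ((centered - reconstructed) ** 2).mean(axis=1)


def loss_features(models, X):
    """Stack losses into a matrix: one row per sample, one column per function."""
    return np.stack([reconstruction_loss(m, X) for m in models], axis=1)


# Toy data (assumption): two hypothetical functions, each lying near its own
# one-dimensional subspace of an 8-dimensional feature space.
rng = np.random.default_rng(0)
dim = 8
directions = np.eye(dim)[:2]


def sample(label, n):
    coeffs = rng.normal(size=(n, 1))
    return coeffs * directions[label] + 0.05 * rng.normal(size=(n, dim))


# One autoencoder per function, fit only on that function's samples.
train_sets = [sample(0, 50), sample(1, 50)]
models = [fit_linear_autoencoder(X, k=1) for X in train_sets]

test_X = np.vstack([sample(0, 20), sample(1, 20)])
test_y = np.array([0] * 20 + [1] * 20)

# Each test sample is described by its loss under every autoencoder.
features = loss_features(models, test_X)

# Stand-in classifier (assumption): pick the function with the lowest loss.
predicted = features.argmin(axis=1)
accuracy = (predicted == test_y).mean()
print(features.shape, accuracy)
```

In a setting like the one described, the feature matrix would have one column per function (932 or 585), and since a protein can carry several functions, a multi-label classifier or per-function thresholds on these columns would be a more natural final step than the single lowest-loss rule used here.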
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Proteins / Machine Learning
Type of study: Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: IEEE J Biomed Health Inform
Year: 2022
Document type: Article
Country of publication: United States