TargetDBP: Accurate DNA-Binding Protein Prediction Via Sequence-Based Multi-View Feature Learning.
IEEE/ACM Trans Comput Biol Bioinform ; 17(4): 1419-1429, 2020.
Article in English | MEDLINE | ID: mdl-30668479
Accurately identifying DNA-binding proteins (DBPs) from protein sequence information is an important but challenging task in protein function annotation. In this paper, we establish a novel computational method, named TargetDBP, for accurately targeting DBPs from primary sequences. In TargetDBP, four single-view features, i.e., AAC (Amino Acid Composition), PsePSSM (Pseudo Position-Specific Scoring Matrix), PsePRSA (Pseudo Predicted Relative Solvent Accessibility), and PsePPDBS (Pseudo Predicted Probabilities of DNA-Binding Sites), are first extracted as base features. Second, a differential evolution algorithm is employed to learn the weights of the four base features. Using the learned weights, we combine the weighted base features to form the original super feature. An effective subset of the super feature is then selected using the feature selection algorithm SVM-RFE+CBR (Support Vector Machine Recursive Feature Elimination with Correlation Bias Reduction). Finally, the prediction model is trained with a support vector machine on the selected feature subset. We also construct a new gold-standard, non-redundant benchmark dataset from the PDB database to evaluate TargetDBP and compare it with existing predictors. On this new dataset, TargetDBP achieves higher performance than other state-of-the-art predictors. The TargetDBP web server and datasets are freely available at http://csbio.njust.edu.cn/bioinf/targetdbp/ for academic use.
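As a rough illustration of the first two steps, the sketch below computes an AAC vector (the fraction of each of the 20 standard amino acids in a sequence) and forms a weighted concatenation of several feature views. It is not the TargetDBP implementation: the function names `aac` and `combine_views`, the toy sequence, the placeholder second view, and the fixed weights are all assumptions made for illustration, and in the method described above the weights would be learned by differential evolution rather than set by hand.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence):
    """Return the 20-dimensional amino acid composition of a sequence."""
    seq = sequence.upper()
    # Count only standard residues; anything else (e.g. 'X') is ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def combine_views(views, weights):
    """Scale each feature view by its weight and concatenate the results."""
    if len(views) != len(weights):
        raise ValueError("need exactly one weight per view")
    combined = []
    for view, w in zip(views, weights):
        combined.extend(w * x for x in view)
    return combined


# Hypothetical toy input: a short sequence and a stand-in second view.
aac_view = aac("MKKAAD")
other_view = [0.5, 0.25]  # placeholder for another view, e.g. PsePSSM
weights = [0.5, 2.0]      # assumed values; the paper learns these weights
super_feature = combine_views([aac_view, other_view], weights)
```

The combined vector here plays the role of the "super feature" before any feature selection; the selection and SVM training steps are omitted.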

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Computational Biology / Protein Sequence Analysis / DNA-Binding Proteins / Machine Learning Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: ACM Trans Comput Biol Bioinform Journal subject: BIOLOGY / MEDICAL INFORMATICS Year: 2020 Document type: Article Country of publication: United States