Classification of some active HIV-1 protease inhibitors and their inactive analogues using some uncorrelated three-dimensional molecular descriptors and a fuzzy c-means algorithm.

Lin, Thy-Hou; Wang, Ging-Ming; Hsu, Yao-Hua.

J Chem Inf Comput Sci ; 42(6): 1490-504, 2002.

Article in English | MEDLINE | ID: mdl-12444748

ABSTRACT

A fuzzy c-means algorithm was used to classify 3D convex hull descriptors computed for 345 active HIV-1 protease inhibitors collected from the literature and 437 inactive analogues retrieved from the MDL/ISIS database. Each compound was represented by 4 to 8 descriptors, which were decorrelated by principal component analysis. These uncorrelated descriptors were then divided into two groups and classified by the fuzzy c-means algorithm. The classification produced a clear-cut switch at the group boundary in the membership functions computed for each uncorrelated descriptor. Compounds whose membership functions did not switch were treated as outliers and counted to estimate the accuracy of the classification. The average classification accuracy for the active inhibitor set was about 80%, better than that obtained by applying a linear discriminant function directly to the original 3D convex hull descriptors. The whole classification scheme was also applied to several sets of conventional descriptors computed for each compound, but the average accuracy was only around 58%. A further classification, using 3D convex hull descriptors selected by comparing the distributions of these descriptors, was performed on a new data set composed of 289 active inhibitors remaining after outlier removal and 63 outliers identified among the inactive analogues in the previous classification. This final classification identified 19 inactive analogues that were similar in structural and topological features to some highly active inhibitors classified together with them.
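The fuzzy c-means step described in the abstract can be illustrated with a minimal sketch. This is the textbook algorithm in pure Python, not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the random initialisation, and the toy 2D points are all assumptions made for illustration. Real input would be the PCA-decorrelated descriptor vectors.

```python
import random

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means; returns (centers, memberships).

    points      -- list of equal-length feature vectors
    memberships -- one row per point, one column per cluster, rows sum to 1
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Assumed initialisation: random memberships, normalised per point.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum
                 for d in range(dim)]
            )
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - c[d]) ** 2 for d in range(dim)) ** 0.5,
                    1e-12)  # guard against division by zero
                for c in centers
            ]
            new_row = [
                1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u

# Toy data (hypothetical): two well-separated groups of 2D points.
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
              (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
toy_centers, toy_memberships = fuzzy_c_means(toy_points)
```

Each row of `toy_memberships` gives a point's degree of membership in the two clusters. A point whose memberships stay close to 0.5 in both columns would be an ambiguous case, loosely analogous to the nonswitching compounds the abstract treats as outliers.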

Subject(s)

Algorithms, HIV Protease Inhibitors/chemistry, HIV Protease Inhibitors/classification, Entropy, HIV Protease Inhibitors/pharmacology, Molecular Structure, Structure-Activity Relationship

Prediction of beta-turns in proteins using the first-order Markov models.

Lin, Thy-Hou; Wang, Ging-Ming; Wang, Yen-Tseng.

J Chem Inf Comput Sci ; 42(1): 123-33, 2002.

Article in English | MEDLINE | ID: mdl-11855976

ABSTRACT

We present a method based on first-order Markov models for predicting simple beta-turns and loops containing multiple turns in proteins. Sequences of 338 proteins in a database are divided, using published turn criteria, into three regions: turn, boundary, and nonturn. A transition probability matrix is constructed for the turn region and another for the nonturn region, using weighted transition probabilities computed for the dipeptides identified in each region. Two such matrices are constructed for the boundary region because the transition probabilities for dipeptides immediately preceding a turn differ from those for dipeptides immediately following it. The window used to scan a protein sequence from the amino (N-) to the carboxyl (C-) terminus is a hexapeptide, because the transition probability computed for a turn tetrapeptide is capped at both the N- and C-termini with a boundary transition probability taken from the respective boundary transition matrix. A sum of the averaged product of the transition probabilities of all the hexapeptides involving each residue is computed. This sum is then weighted with a probability computed by assuming that all the hexapeptides come from the nonturn region, which gives the final prediction quantity. Both simple beta-turns and loops containing multiple turns in a protein are then identified by rises in the computed prediction quantity. The performance of the prediction scheme, or the percentage of correct predictions, is evaluated by computing Matthews correlation coefficients for each protein predicted. For a group of test proteins, the method gives a better correlation between the percentage of correct predictions and the Matthews correlation coefficients than some secondary structure prediction methods do. The prediction accuracy is better than 70% for about 40% of the proteins in the database and for 50% of the proteins in the test set. For the test set, this percentage drops to 30% if the structures of all the proteins in the set are treated as unknown.
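The core ideas in this abstract, dipeptide transition matrices, window scanning, and evaluation by the Matthews correlation coefficient, can be sketched in a few lines. This is a deliberately simplified illustration and not the authors' scheme: it uses only two matrices (turn and nonturn) instead of four, scores a window by a plain log-odds sum rather than the weighted, averaged products described above, and adds a pseudocount. All function names and the toy training segments are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def transition_matrix(segments, pseudocount=1.0):
    """Estimate P(next residue | current residue) from dipeptide counts.

    The pseudocount is an assumption; it keeps unseen dipeptides nonzero.
    """
    counts = {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for seg in segments:
        for a, b in zip(seg, seg[1:]):  # consecutive residue pairs
            if a in counts and b in counts[a]:
                counts[a][b] += 1.0
    matrix = {}
    for a, row in counts.items():
        total = sum(row.values())
        matrix[a] = {b: c / total for b, c in row.items()}
    return matrix

def window_log_odds(window, turn, nonturn):
    """Sum of log(P_turn / P_nonturn) over the dipeptides in the window."""
    return sum(
        math.log(turn[a][b] / nonturn[a][b])
        for a, b in zip(window, window[1:])
    )

def scan(sequence, turn, nonturn, width=6):
    """Score every window of `width` residues from the N- to the C-terminus."""
    return [
        window_log_odds(sequence[i:i + width], turn, nonturn)
        for i in range(len(sequence) - width + 1)
    ]

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy training segments (hypothetical, not taken from the 338-protein set).
turn_model = transition_matrix(["NPDG", "NPDG", "DPSG"])
nonturn_model = transition_matrix(["AVLI", "LLVA", "AVIL"])
window_scores = scan("AVLINPDGAV", turn_model, nonturn_model)
```

Windows with a positive score resemble the turn training segments more than the nonturn ones, so a rise in `window_scores` along the sequence plays a role loosely similar to the rising prediction quantity in the abstract. `matthews_cc` would then be applied to per-residue counts of true and false turn predictions for each protein.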

Subject(s)

Molecular Models, Protein Secondary Structure, Proteins/chemistry, Amino Acid Sequence, X-Ray Crystallography, Markov Chains, Molecular Sequence Data
