Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 7 de 7
Filtrar
Más filtros











Base de datos
Intervalo de año de publicación
1.
Sci Rep ; 12(1): 5867, 2022 04 07.
Artículo en Inglés | MEDLINE | ID: mdl-35393450

RESUMEN

SARS-CoV-2 pandemic first emerged in late 2019 in China. It has since infected more than 298 million individuals and caused over 5 million deaths globally. The identification of essential proteins in a protein-protein interaction network (PPIN) is not only crucial in understanding the process of cellular life but also useful in drug discovery. There are many centrality measures to detect influential nodes in complex networks. Since SARS-CoV-2 and (H1N1) influenza PPINs pose 553 common human proteins. Analyzing influential proteins and comparing these networks together can be an effective step in helping biologists for drug-target prediction. We used 21 centrality measures on SARS-CoV-2 and (H1N1) influenza PPINs to identify essential proteins. We applied principal component analysis and unsupervised machine learning methods to reveal the most informative measures. Appealingly, some measures had a high level of contribution in comparison to others in both PPINs, namely Decay, Residual closeness, Markov, Degree, closeness (Latora), Barycenter, Closeness (Freeman), and Lin centralities. We also investigated some graph theory-based properties like the power law, exponential distribution, and robustness. Both PPINs tended to properties of scale-free networks that expose their nature of heterogeneity. Dimensionality reduction and unsupervised learning methods were so effective to uncover appropriate centrality measures.


Asunto(s)
COVID-19 , Subtipo H1N1 del Virus de la Influenza A , Gripe Humana , Humanos , Subtipo H1N1 del Virus de la Influenza A/metabolismo , Mapas de Interacción de Proteínas , Proteínas/metabolismo , SARS-CoV-2
2.
J Bioinform Comput Biol ; 19(2): 2150002, 2021 04.
Artículo en Inglés | MEDLINE | ID: mdl-33657986

RESUMEN

A central problem of systems biology is the reconstruction of Gene Regulatory Networks (GRNs) by the use of time series data. Although many attempts have been made to design an efficient method for GRN inference, providing a best solution is still a challenging task. Existing noise, low number of samples, and high number of nodes are the main reasons causing poor performance of existing methods. The present study applies the ensemble Kalman filter algorithm to model a GRN from gene time series data. The inference of a GRN is decomposed with p genes into p subproblems. In each subproblem, the ensemble Kalman filter algorithm identifies the weight of interactions for each target gene. With the use of the ensemble Kalman filter, the expression pattern of the target gene is predicted from the expression patterns of all the remaining genes. The proposed method is compared with several well-known approaches. The results of the evaluation indicate that the proposed method improves inference accuracy and demonstrates better regulatory relations with noisy data.


Asunto(s)
Algoritmos , Redes Reguladoras de Genes , Biología de Sistemas , Factores de Tiempo
3.
BMC Bioinformatics ; 21(1): 475, 2020 Oct 22.
Artículo en Inglés | MEDLINE | ID: mdl-33092523

RESUMEN

BACKGROUND: Single individual haplotype problem refers to reconstructing haplotypes of an individual based on several input fragments sequenced from a specified chromosome. Solving this problem is an important task in computational biology and has many applications in the pharmaceutical industry, clinical decision-making, and genetic diseases. It is known that solving the problem is NP-hard. Although several methods have been proposed to solve the problem, it is found that most of them have low performances in dealing with noisy input fragments. Therefore, proposing a method which is accurate and scalable, is a challenging task. RESULTS: In this paper, we introduced a method, named NCMHap, which utilizes the Neutrosophic c-means (NCM) clustering algorithm. The NCM algorithm can effectively detect the noise and outliers in the input data. In addition, it can reduce their effects in the clustering process. The proposed method has been evaluated by several benchmark datasets. Comparing with existing methods indicates when NCM is tuned by suitable parameters, the results are encouraging. In particular, when the amount of noise increases, it outperforms the comparing methods. CONCLUSION: The proposed method is validated using simulated and real datasets. The achieved results recommend the application of NCMHap on the datasets which involve the fragments with a huge amount of gaps and noise.


Asunto(s)
Algoritmos , Biología Computacional/métodos , Haplotipos/genética , Secuencia de Bases , Análisis por Conglomerados , Simulación por Computador , Bases de Datos Genéticas , Humanos , Polimorfismo de Nucleótido Simple/genética
4.
PLoS One ; 15(10): e0241291, 2020.
Artículo en Inglés | MEDLINE | ID: mdl-33120403

RESUMEN

Decreasing the cost of high-throughput DNA sequencing technologies, provides a huge amount of data that enables researchers to determine haplotypes for diploid and polyploid organisms. Although various methods have been developed to reconstruct haplotypes in diploid form, their accuracy is still a challenging task. Also, most of the current methods cannot be applied to polyploid form. In this paper, an iterative method is proposed, which employs hypergraph to reconstruct haplotype. The proposed method by utilizing chaotic viewpoint can enhance the obtained haplotypes. For this purpose, a haplotype set was randomly generated as an initial estimate, and its consistency with the input fragments was described by constructing a weighted hypergraph. Partitioning the hypergraph specifies those positions in the haplotype set that need to be corrected. This procedure is repeated until no further improvement could be achieved. Each element of the finalized haplotype set is mapped to a line by chaos game representation, and a coordinate series is defined based on the position of mapped points. Then, some positions with low qualities can be assessed by applying a local projection. Experimental results on both simulated and real datasets demonstrate that this method outperforms most other approaches, and is promising to perform the haplotype assembly.


Asunto(s)
Algoritmos , Genoma Humano , Haplotipos , Modelos Genéticos , Análisis de Secuencia de ADN , Humanos
5.
Data Brief ; 32: 106144, 2020 Oct.
Artículo en Inglés | MEDLINE | ID: mdl-32835040

RESUMEN

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the COVID-19 pandemic. It was first detected in China and was rapidly spread to other countries. Several thousands of whole genome sequences of SARS-CoV-2 have been reported and it is important to compare them and identify distinctive evolutionary/mutant markers. Utilizing chaos game representation (CGR) as well as recurrence quantification analysis (RQA) as a powerful nonlinear analysis technique, we proposed an effective process to extract several valuable features from genomic sequences of SARS-CoV-2. The represented features enable us to compare genomic sequences with different lengths. The provided dataset involves totally 18 RQA-based features for 4496 instances of SARS-CoV-2.

6.
Sci Rep ; 9(1): 18580, 2019 12 09.
Artículo en Inglés | MEDLINE | ID: mdl-31819106

RESUMEN

Feature selection problem is one of the most significant issues in data classification. The purpose of feature selection is selection of the least number of features in order to increase accuracy and decrease the cost of data classification. In recent years, due to appearance of high-dimensional datasets with low number of samples, classification models have encountered over-fitting problem. Therefore, the need for feature selection methods that are used to remove the extensions and irrelevant features is felt. Recently, although, various methods have been proposed for selecting the optimal subset of features with high precision, these methods have encountered some problems such as instability, high convergence time, selection of a semi-optimal solution as the final result. In other words, they have not been able to fully extract the effective features. In this paper, a hybrid method based on the IWSSr method and Shuffled Frog Leaping Algorithm (SFLA) is proposed to select effective features in a large-scale gene dataset. The proposed algorithm is implemented in two phases: filtering and wrapping. In the filter phase, the Relief method is used for weighting features. Then, in the wrapping phase, by using the SFLA and the IWSSr algorithms, the search for effective features in a feature-rich area is performed. The proposed method is evaluated by using some standard gene expression datasets. The experimental results approve that the proposed approach in comparison to similar methods, has been achieved a more compact set of features along with high accuracy. The source code and testing datasets are available at https://github.com/jimy2020/SFLA_IWSSr-Feature-Selection.


Asunto(s)
Interpretación Estadística de Datos , Neoplasias/genética , Algoritmos , Simulación por Computador , Bases de Datos Genéticas , Conjuntos de Datos como Asunto , Femenino , Perfilación de la Expresión Génica , Regulación de la Expresión Génica , Humanos , Aprendizaje Automático , Masculino , Análisis de Secuencia por Matrices de Oligonucleótidos , Reproducibilidad de los Resultados , Sensibilidad y Especificidad , Programas Informáticos
7.
Sci Rep ; 9(1): 10361, 2019 07 17.
Artículo en Inglés | MEDLINE | ID: mdl-31316124

RESUMEN

Sequence data are deposited in the form of unphased genotypes and it is not possible to directly identify the location of a particular allele on a specific parental chromosome or haplotype. This study employed nonlinear time series modeling approaches to analyze the haplotype sequences obtained from the NGS sequencing method. To evaluate the chaotic behavior of haplotypes, we analyzed their whole sequences, as well as several subsequences from distinct haplotypes, in terms of the SNP distribution on their chromosomes. This analysis utilized chaos game representation (CGR) followed by the application of two different scaling methods. It was found that chaotic behavior clearly exists in most haplotype subsequences. For testing the applicability of the proposed model, the present research determined the alleles in gap positions and positions with low coverage by using chromosome subsequences in which 10% of each subsequence's alleles are replaced by gaps. After conversion of the subsequences' CGR into the coordinate series, a Local Projection (LP) method predicted the measure of ambiguous positions in the coordinate series. It was discovered that the average reconstruction rate for all input data is more than 97%, demonstrating that applying this knowledge can effectively improve the reconstruction rate of given haplotypes.


Asunto(s)
Mapeo Cromosómico/métodos , Biología Computacional/métodos , Haplotipos , Dinámicas no Lineales , Polimorfismo de Nucleótido Simple , Algoritmos , Alelos , Cromosomas Humanos/genética , Conjuntos de Datos como Asunto , Fractales , Genoma Humano , Humanos
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA