Search | Portal Regional de la BVS (VHL Regional Portal)

1.

Point estimation and related classification problems for several Lindley populations with application using COVID-19 data.

Bal, Debasmita; Tripathy, Manas Ranjan; Kumar, Somesh.

J Appl Stat ; 51(10): 1976-2006, 2024.

Article in English | MEDLINE | ID: mdl-39071252

ABSTRACT

The problems of point estimation and classification under the assumption that the training data follow a Lindley distribution are considered. Bayes estimators are derived for the parameter of the Lindley distribution applying the Markov chain Monte Carlo (MCMC), and Tierney and Kadane's [Tierney and Kadane, Accurate approximations for posterior moments and marginal densities, J. Amer. Statist. Assoc. 81 (1986), pp. 82-86] methods. In the sequel, we prove that the Bayes estimators using Tierney and Kadane's approximation and Lindley's approximation both converge to the maximum likelihood estimator (MLE), as n → ∞, where n is the sample size. The performances of all the proposed estimators are compared with some of the existing ones using bias and mean squared error (MSE), numerically. It has been noticed from our simulation study that the proposed estimators perform better than some of the existing ones. Applying these estimators, we construct several plug-in type classification rules and a rule that uses the likelihood accordance function. The performances of each of the rules are numerically evaluated using the expected probability of misclassification (EPM). Two real-life examples related to COVID-19 disease are considered for illustrative purposes.
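As a rough illustration of the plug-in idea in this abstract, the sketch below fits a one-parameter Lindley density, f(x; θ) = θ²/(1+θ) · (1+x) · exp(−θx), by maximum likelihood and assigns a new observation to whichever of two populations has the larger fitted density. It uses only the MLE, not the Bayes estimators the paper proposes, and the function names are illustrative assumptions rather than anything taken from the article.

```python
import math

def lindley_mle(sample):
    """Closed-form MLE of theta for the one-parameter Lindley density."""
    xbar = sum(sample) / len(sample)
    # Setting the score to zero gives xbar*theta^2 + (xbar - 1)*theta - 2 = 0;
    # the MLE is the positive root of that quadratic.
    return (-(xbar - 1) + math.sqrt((xbar - 1) ** 2 + 8 * xbar)) / (2 * xbar)

def lindley_logpdf(x, theta):
    """Log of theta^2/(1+theta) * (1+x) * exp(-theta*x) for x > 0."""
    return 2 * math.log(theta) - math.log1p(theta) + math.log1p(x) - theta * x

def plug_in_classify(x, theta_1, theta_2):
    """Return 1 or 2: the population whose fitted density is larger at x."""
    return 1 if lindley_logpdf(x, theta_1) >= lindley_logpdf(x, theta_2) else 2
```

In use, `theta_1` and `theta_2` would come from `lindley_mle` applied to each training sample; any other estimator of θ could be plugged in the same way.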

2.

Mechanics-based classification rule for plants.

Kanahama, Tohya; Sato, Motohiro.

Proc Natl Acad Sci U S A ; 120(41): e2308319120, 2023 10 10.

Article in English | MEDLINE | ID: mdl-37801474

ABSTRACT

The height of thick and solid plants, such as woody plants, is proportional to the two-thirds power of their diameter at breast height. However, this rule cannot be applied to herbaceous plants that are thin and soft because the mechanisms supporting their bodies are fundamentally different. This study aims to clarify the rigidity control mechanism resulting from turgor pressure caused by internal water in herbaceous plants to formulate the corresponding scaling law. We modeled a herbaceous plant as a cantilever with the ground side as a fixed end, and the greatest height was formulated considering the axial tension force from the turgor pressure. The scaling law describing the relationship between the height and diameter in terms of the turgor pressure was theoretically derived. Moreover, we proposed a plant classification rule based on stress distribution.
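The two-thirds scaling for woody plants mentioned in this abstract can be written as H = c · D^(2/3), so on log-log axes height against diameter is a straight line with slope 2/3. A minimal sketch of recovering that exponent by least squares follows; the data are synthetic and the constant `5.0` is an arbitrary assumption, not a value from the paper.

```python
import math

def fit_power_law_exponent(diameters, heights):
    """Least-squares slope of log(height) against log(diameter)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(h) for h in heights]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Synthetic (made-up) data that follow H = c * D**(2/3) exactly.
diameters = [0.1, 0.2, 0.4, 0.8]
heights = [5.0 * d ** (2 / 3) for d in diameters]
exponent = fit_power_law_exponent(diameters, heights)
```

A fitted exponent that departs clearly from 2/3 would be one simple sign that a sample does not follow the woody-plant rule; the classification rule proposed in the paper itself is based on stress distribution and is not reproduced here.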

Subject(s)

Plants, Wood

3.

Immune responses of different COVID-19 vaccination strategies by analyzing single-cell RNA sequencing data from multiple tissues using machine learning methods.

Li, Hao; Ma, Qinglan; Ren, Jingxin; Guo, Wei; Feng, Kaiyan; Li, Zhandong; Huang, Tao; Cai, Yu-Dong.

Front Genet ; 14: 1157305, 2023.

Article in English | MEDLINE | ID: mdl-37007947

ABSTRACT

Multiple types of COVID-19 vaccines have been shown to be highly effective in preventing SARS-CoV-2 infection and in reducing post-infection symptoms. Almost all of these vaccines induce systemic immune responses, but differences in immune responses induced by different vaccination regimens are evident. This study aimed to reveal the differences in immune gene expression levels of different target cells under different vaccine strategies after SARS-CoV-2 infection in hamsters. A machine learning based process was designed to analyze single-cell transcriptomic data of different cell types from the blood, lung, and nasal mucosa of hamsters infected with SARS-CoV-2, including B and T cells from the blood and nasal cavity, macrophages from the lung and nasal cavity, alveolar epithelial and lung endothelial cells. The cohort was divided into five groups: non-vaccinated (control), 2*adenovirus (two doses of adenovirus vaccine), 2*attenuated (two doses of attenuated virus vaccine), 2*mRNA (two doses of mRNA vaccine), and mRNA/attenuated (primed by mRNA vaccine, boosted by attenuated vaccine). All genes were ranked using five signature ranking methods (LASSO, LightGBM, Monte Carlo feature selection, mRMR, and permutation feature importance). Some key genes that contributed to the analysis of immune changes, such as RPS23, DDX5, PFN1 in immune cells, and IRF9 and MX1 in tissue cells, were screened. Afterward, the five feature sorting lists were fed into the incremental feature selection framework, which contained two classification algorithms (decision tree [DT] and random forest [RF]), to construct optimal classifiers and generate quantitative rules. Results showed that random forest classifiers could provide relatively higher performance than decision tree classifiers, whereas the DT classifiers provided quantitative rules that indicated special gene expression levels under different vaccine strategies. These findings may help us to develop better protective vaccination programs and new vaccines.
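The incremental feature selection step described in this abstract (and reused in several of the entries below) can be sketched roughly as follows: take a ranked feature list, score a classifier on the top-k features for growing k, and keep the best-scoring prefix. The `score_fn` callback, the feature names, and the toy scorer are hypothetical placeholders; in practice the score would be something like the cross-validated performance of a decision tree or random forest.

```python
def incremental_feature_selection(ranked_features, score_fn, step=1):
    """Score the top-k features for growing k; return the best prefix and its score.

    ranked_features: feature names, most important first.
    score_fn: hypothetical callback that takes a list of features and returns
              a score where higher is better.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(step, len(ranked_features) + 1, step):
        subset = ranked_features[:k]
        score = score_fn(subset)
        if score > best_score:  # strict comparison: ties keep the smaller subset
            best_subset, best_score = subset, score
    return best_subset, best_score

# Toy scorer (made up): two features are informative, and every extra
# feature costs a small penalty.
def toy_score(subset):
    informative = {"f1", "f2"}
    return len(informative & set(subset)) - 0.01 * len(subset)

ranked = ["f1", "f2", "f3", "f4"]
best, score = incremental_feature_selection(ranked, toy_score)
```

Running the loop once per ranking method and per classifier, as the abstract describes, yields one candidate optimal classifier for each combination.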

4.

Identifying anal and cervical tumorigenesis-associated methylation signaling with machine learning methods.

Jian, Fangfang; Huang, FeiMing; Zhang, Yu-Hang; Huang, Tao; Cai, Yu-Dong.

Front Oncol ; 12: 998032, 2022.

Article in English | MEDLINE | ID: mdl-36249027

ABSTRACT

Cervical and anal carcinoma are neoplastic diseases with various intraepithelial neoplasia stages. The underlying mechanisms for cancer initiation and progression have not been fully revealed. DNA methylation has been shown to be aberrantly regulated during tumorigenesis in anal and cervical carcinoma, revealing the important roles of DNA methylation signaling as a biomarker to distinguish cancer stages in clinics. In this research, several machine learning methods were used to analyze the methylation profiles on anal and cervical carcinoma samples, which were divided into three classes representing various stages of tumor progression. Advanced feature selection methods, including Boruta, LASSO, LightGBM, and MCFS, were used to select methylation features that are highly correlated with cancer progression. Some methylation probes including cg01550828 and its corresponding gene RNF168 have been reported to be associated with human papilloma virus-related anal cancer. As for biomarkers for cervical carcinoma, cg27012396 and its functional gene HDAC4 were confirmed to regulate the glycolysis and survival of hypoxic tumor cells in cervical carcinoma. Furthermore, we developed effective classifiers for identifying various tumor stages and derived classification rules that reflect the quantitative impact of methylation on tumorigenesis. The current study identified methylation signals associated with the development of cervical and anal carcinoma at qualitative and quantitative levels using advanced machine learning methods.

5.

Identification of methylation signatures and rules for predicting the severity of SARS-CoV-2 infection with machine learning methods.

Liu, Zhiyang; Meng, Mei; Ding, ShiJian; Zhou, XiaoChao; Feng, KaiYan; Huang, Tao; Cai, Yu-Dong.

Front Microbiol ; 13: 1007295, 2022.

Article in English | MEDLINE | ID: mdl-36212830

ABSTRACT

Patients infected with SARS-CoV-2 at various severities have different clinical manifestations and treatments. Mild or moderate patients usually recover with conventional medical treatment, but severe patients require prompt professional treatment. Thus, stratifying infected patients for targeted treatment is meaningful. A computational workflow was designed in this study to identify key blood methylation features and rules that can distinguish the severity of SARS-CoV-2 infection. First, the methylation features in the expression profile were deeply analyzed by a Monte Carlo feature selection method. A feature list was generated. Next, this ranked feature list was fed into the incremental feature selection method to determine the optimal features for different classification algorithms, thereby further building optimal classifiers. These selected key features were analyzed by functional enrichment to detect their biofunctional information. Furthermore, a set of rules was set up by a white-box algorithm, decision tree, to uncover different methylation patterns for various severities of SARS-CoV-2 infection. Some genes (PARP9, MX1, IRF7), corresponding to essential methylation sites, and rules were validated by published academic literature. Overall, this study contributes to revealing potential expression features and provides a reference for patient stratification. Physicians can prioritize and allocate health and medical resources for COVID-19 patients based on their predicted severe clinical outcomes.

6.

Identification of methylation signatures associated with CAR T cell in B-cell acute lymphoblastic leukemia and non-hodgkin's lymphoma.

Song, Jiwei; Huang, FeiMing; Chen, Lei; Feng, KaiYan; Jian, Fangfang; Huang, Tao; Cai, Yu-Dong.

Front Oncol ; 12: 976262, 2022.

Article in English | MEDLINE | ID: mdl-36033519

ABSTRACT

CD19-targeted CAR T cell immunotherapy has exceptional efficacy for the treatment of B-cell malignancies. B-cell acute lymphocytic leukemia and non-Hodgkin's lymphoma are two common B-cell malignancies with high recurrence rate and are refractory to cure. Although CAR T-cell immunotherapy overcomes the limitations of conventional treatments for such malignancies, failure of treatment and tumor recurrence remain common. In this study, we searched for important methylation signatures to differentiate CAR-transduced and untransduced T cells from patients with acute lymphoblastic leukemia and non-Hodgkin's lymphoma. First, we used three feature ranking methods, namely, Monte Carlo feature selection, light gradient boosting machine, and least absolute shrinkage and selection operator, to rank all methylation features in order of their importance. Then, the incremental feature selection method was adopted to construct efficient classifiers and filter the optimal feature subsets. Some important methylated genes, namely, SERPINB6, ANK1, PDCD5, DAPK2, and DNAJB6, were identified. Furthermore, the classification rules for distinguishing different classes were established, which can precisely describe the role of methylation features in the classification. Overall, we applied advanced machine learning approaches to the high-throughput data, investigating the mechanism of CAR T cells to establish the theoretical foundation for modifying CAR T cells.

7.

Error rate control for classification rules in multiclass mixture models.

Mary-Huard, Tristan; Perduca, Vittorio; Martin-Magniette, Marie-Laure; Blanchard, Gilles.

Int J Biostat ; 18(2): 381-396, 2022 11 01.

Article in English | MEDLINE | ID: mdl-34845884

ABSTRACT

In the context of finite mixture models one considers the problem of classifying as many observations as possible in the classes of interest while controlling the classification error rate in these same classes. Similar to what is done in the framework of statistical test theory, different type I and type II-like classification error rates can be defined, along with their associated optimal rules, where optimality is defined as minimizing type II error rate while controlling type I error rate at some nominal level. It is first shown that finding an optimal classification rule boils down to searching for an optimal region in the observation space in which to apply the classical Maximum A Posteriori (MAP) rule. Depending on the misclassification rate to be controlled, the shape of the optimal region is provided, along with a heuristic to compute the optimal classification rule in practice. In particular, a multiclass FDR-like optimal rule is defined and compared to the thresholded MAP rule that is used in most applications. It is shown on both simulated and real datasets that the FDR-like optimal rule may be significantly less conservative than the thresholded MAP rule.
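For orientation, the baseline this abstract compares against, the thresholded MAP rule, can be sketched as: compute the posterior class probabilities under the mixture, take the most probable class, and abstain when that posterior falls below a threshold. The sketch below assumes a one-dimensional Gaussian mixture with known parameters; it is not the FDR-like optimal rule the paper defines, and all names and numbers are illustrative.

```python
import math

def posteriors(x, weights, means, sds):
    """Posterior class probabilities of x under a 1-D Gaussian mixture."""
    dens = [
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sds)
    ]
    total = sum(dens)
    return [d / total for d in dens]

def thresholded_map(x, weights, means, sds, threshold):
    """Return the MAP class index, or None (abstain) if its posterior is below threshold."""
    post = posteriors(x, weights, means, sds)
    k = max(range(len(post)), key=post.__getitem__)
    return k if post[k] >= threshold else None
```

With two equally weighted unit-variance components centred at 0 and 4 and a threshold of 0.9, a point at 0 is assigned to the first class, while a point at 2 (equally likely under both) is left unclassified.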

Subject(s)

Algorithms

8.

Identifying Transcriptomic Signatures and Rules for SARS-CoV-2 Infection.

Zhang, Yu-Hang; Li, Hao; Zeng, Tao; Chen, Lei; Li, Zhandong; Huang, Tao; Cai, Yu-Dong.

Front Cell Dev Biol ; 8: 627302, 2020.

Article in English | MEDLINE | ID: mdl-33505977

ABSTRACT

The worldwide Coronavirus Disease 2019 (COVID-19) pandemic was triggered by the wide spread of a new strain of coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Multiple studies on the pathogenesis of SARS-CoV-2 were conducted immediately after the spread of the disease. However, the molecular pathogenesis of the virus and related diseases has still not been fully revealed. In this study, we attempted to identify new transcriptomic signatures as candidate diagnostic models for clinical testing or as therapeutic targets for vaccine design. Using the recently reported transcriptomics data of upper airway tissue with acute respiratory illnesses, we integrated multiple machine learning methods to identify effective qualitative biomarkers and quantitative rules for the distinction of SARS-CoV-2 infection from other infectious diseases. The transcriptomics data were first analyzed by Boruta so that important features were selected, which were further evaluated by the minimum redundancy maximum relevance method. A feature list was produced. This list was fed into the incremental feature selection, incorporating some classification algorithms, to extract qualitative biomarker genes and construct quantitative rules. Also, an efficient classifier was built to identify patients infected with SARS-CoV-2. The findings reported in this study may help in revealing the potential pathogenic mechanisms of COVID-19 and finding new targets for vaccine design.

9.

DQB: A novel dynamic quantitive classification model using artificial bee colony algorithm with application on gene expression profiles.

Alshamlan, Hala M.

Saudi J Biol Sci ; 25(5): 932-946, 2018 Jul.

Article in English | MEDLINE | ID: mdl-30108444

ABSTRACT

In the medical domain, it is important to develop a rule-based classification model, because such a model is comprehensible and accounts for its predictions. Moreover, it is desirable to know not only the classification decisions but also what leads to these decisions. In this paper, we propose a novel dynamic quantitative rule-based classification model, namely DQB, which integrates quantitative association rule mining and the Artificial Bee Colony (ABC) algorithm to provide users with more convenience in terms of understandability and interpretability via an accurate class quantitative association rule-based classifier model. As far as we know, this is the first attempt to apply the ABC algorithm in mining for quantitative rule-based classifier models. In addition, this is the first attempt to use quantitative rule-based classification models for classifying microarray gene expression profiles. Also, in this research we developed a new dynamic local search strategy named DLS, which improves the local search of the ABC algorithm. The performance of the proposed model has been compared with well-known quantitative-based classification methods and bio-inspired meta-heuristic classification algorithms, using six gene expression profiles for binary and multi-class cancer datasets. From the results, it can be concluded that a considerable increase in classification accuracy is obtained for the DQB when compared to other available algorithms in the literature, and it is able to provide an interpretable model for biologists. This confirms the significance of the proposed algorithm in constructing a rule-based classifier model, and accordingly shows that these rules capture high-quality, meaningful knowledge extracted from the training set, where all subsets of quantitative rules report close to 100% classification accuracy with a minimum number of genes. Notably, to the best of our knowledge, several genes were identified that have not been reported in past studies. Regarding applicability, based on the results acquired from microarray gene expression analysis, we conclude that DQB can be adopted in different real-world applications with some modifications.

10.

Naïve Bayesian Classifier and Genetic Risk Score for Genetic Risk Prediction of a Categorical Trait: Not so Different after all!

Sebastiani, Paola; Solovieff, Nadia; Sun, Jenny X.

Front Genet ; 3: 26, 2012.

Article in English | MEDLINE | ID: mdl-22393331

ABSTRACT

One of the most popular modeling approaches to genetic risk prediction is to use a summary of risk alleles in the form of an unweighted or a weighted genetic risk score, with weights that relate to the odds for the phenotype in carriers of the individual alleles. Recent contributions have proposed the use of Bayesian classification rules using Naïve Bayes classifiers. We examine the relation between the two approaches for genetic risk prediction and show that the methods are mathematically related. In addition, we study the properties of the two approaches and describe how they can be generalized to include various models of inheritance.
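One way to see why the two approaches in this abstract are related: under the Naïve Bayes independence assumption, the log posterior odds of being a case is the log prior odds plus a sum of per-SNP log likelihood ratios, which has the same additive form as a weighted risk score. The sketch below checks that identity numerically. The genotype frequencies are made up for illustration, and the function names are assumptions, not the authors' notation.

```python
import math

def nb_log_odds(genotypes, p_case, p_control, prior_case):
    """Naive Bayes log posterior odds: log prior odds plus a sum of per-SNP terms."""
    log_odds = math.log(prior_case / (1 - prior_case))
    for j, g in enumerate(genotypes):
        # Each SNP contributes an additive "weight", as in a weighted risk score.
        log_odds += math.log(p_case[j][g] / p_control[j][g])
    return log_odds

def nb_posterior(genotypes, p_case, p_control, prior_case):
    """Posterior probability of being a case, computed directly from Bayes' rule."""
    like_case = prior_case
    like_control = 1 - prior_case
    for j, g in enumerate(genotypes):
        like_case *= p_case[j][g]
        like_control *= p_control[j][g]
    return like_case / (like_case + like_control)

# Hypothetical genotype frequencies P(genotype = 0, 1, 2 | class) for two SNPs.
p_case = [[0.30, 0.50, 0.20], [0.50, 0.40, 0.10]]
p_control = [[0.50, 0.40, 0.10], [0.60, 0.30, 0.10]]
genotypes = [2, 1]
score = nb_log_odds(genotypes, p_case, p_control, prior_case=0.5)
posterior = nb_posterior(genotypes, p_case, p_control, prior_case=0.5)
```

Applying the logistic function to `score` recovers `posterior`, so ranking individuals by the additive score and by the Naïve Bayes posterior gives the same order.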
