Results 1 - 20 of 118
1.
Magn Reson Med ; 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39270056

ABSTRACT

PURPOSE: To shorten CEST acquisition time by leveraging Z-spectrum undersampling combined with deep learning for CEST map construction from undersampled Z-spectra. METHODS: Fisher information gain analysis identified optimal frequency offsets (termed "Fisher offsets") for the multi-pool fitting model, maximizing information gain for the amplitude and the FWHM parameters. These offsets guided initial subsampling levels. A U-Net, trained on undersampled brain CEST images from 18 volunteers, produced CEST maps at 3 T with varied undersampling levels. Feasibility was first tested using retrospective undersampling at three levels, followed by prospective in vivo undersampling (15 of 53 offsets), reducing scan time significantly. Additionally, glioblastoma grade IV pathology was simulated to evaluate network performance in patient-like cases. RESULTS: Traditional multi-pool models failed to quantify CEST maps from undersampled images (structural similarity index [SSIM] <0.2, peak SNR <20, Pearson r <0.1). Conversely, U-Net fitting successfully addressed undersampled data challenges. The study suggests CEST scan time reduction is feasible by undersampling 15, 25, or 35 of 53 Z-spectrum offsets. Prospective undersampling cut scan time by a factor of 3.5, with a maximum mean squared error of 4.4e-4, r = 0.82, and SSIM = 0.84, compared to the ground truth. The network also reliably predicted CEST values for simulated glioblastoma pathology. CONCLUSION: The U-Net architecture effectively quantifies CEST maps from undersampled Z-spectra at various undersampling levels.
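The offset-selection idea behind "Fisher offsets" can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the paper's method: a single Lorentzian pool (the paper fits a multi-pool model and also targets the FWHM parameter), an invented 53-point offset grid, and unit Gaussian noise.

```python
def amp_sensitivity(offset_ppm, fwhm_ppm):
    """Sensitivity df/dA of a single Lorentzian pool
    f(w) = A * (G/2)^2 / ((G/2)^2 + w^2) to its amplitude A."""
    hw = fwhm_ppm / 2.0
    return hw * hw / (hw * hw + offset_ppm * offset_ppm)

def fisher_info_amplitude(offset_ppm, fwhm_ppm, sigma=1.0):
    """Per-offset Fisher information for A under Gaussian noise:
    I(w) = (df/dA)^2 / sigma^2."""
    return (amp_sensitivity(offset_ppm, fwhm_ppm) / sigma) ** 2

# A hypothetical 53-point acquisition grid; keep the 15 most informative offsets.
grid = [0.25 * k for k in range(-26, 27)]
fisher_offsets = sorted(grid, key=lambda w: -fisher_info_amplitude(w, fwhm_ppm=1.0))[:15]
```

For an amplitude parameter the information peaks near the resonance, so the selected subset clusters around the pool's center offset.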

2.
Entropy (Basel) ; 26(8)2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39202089

ABSTRACT

In this paper, we are concerned with the process of experimental information gain. Building on previous work, we show that this is a discontinuous process in which the initiating quantum-mechanical matter-instrument interactions are being turned into macroscopically observable events (EOs). In the course of time, such EOs evolve into spatio-temporal patterns of EOs, which allow conceivable alternatives of physical explanation to be distinguished. Focusing on the specific case of photon detection, we show that during their lifetimes, EOs proceed through the four phases of initiation, detection, erasure and reset. Once generated, the observational value of EOs can be measured in units of the Planck quantum of physical action h = 4.136 × 10⁻¹⁵ eV s. Once terminated, each unit of entropy of size k_B = 8.617 × 10⁻⁵ eV/K, which had been created in the instrument during the observational phase, needs to be removed from the instrument to ready it for a new round of photon detection. This withdrawal of entropy takes place at an energetic cost of at least two units of the Landauer minimum energy bound of E_La = ln(2) k_B T_D for each unit of entropy of size k_B.
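With the quoted constants, the Landauer bound is straightforward to evaluate numerically; the sketch below does so for a detector temperature of T_D = 300 K, which is an illustrative choice rather than a value from the paper.

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K, as quoted in the abstract

def landauer_bound_ev(t_detector_k: float) -> float:
    """Minimum energy E_La = ln(2) * k_B * T_D (in eV) associated with
    erasing one unit of entropy of size k_B at detector temperature T_D."""
    return math.log(2.0) * K_B * t_detector_k

e_la = landauer_bound_ev(300.0)   # roughly 0.018 eV at room temperature
cost_per_reset = 2 * e_la         # the abstract's "at least two units"
```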

3.
Entropy (Basel) ; 26(8)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39202095

ABSTRACT

As a severe inflammatory response syndrome, sepsis presents complex challenges in predicting patient outcomes due to its unclear pathogenesis and the unstable discharge status of affected individuals. In this study, we develop a machine learning-based method for predicting the discharge status of sepsis patients, aiming to improve treatment decisions. To enhance the robustness of our analysis against outliers, we incorporate robust statistical methods, specifically the minimum covariance determinant technique. We utilize the random forest imputation method to effectively manage and impute missing data. For feature selection, we employ Lasso penalized logistic regression, which efficiently identifies significant predictors and reduces model complexity, setting the stage for the application of more complex predictive methods. Our predictive analysis incorporates multiple machine learning methods, including random forest, support vector machine, and XGBoost. We compare the prediction performance of these methods with Lasso penalized logistic regression to identify the most effective approach. Each method's performance is rigorously evaluated through ten iterations of 10-fold cross-validation to ensure robust and reliable results. Our comparative analysis reveals that XGBoost surpasses the other models, demonstrating its exceptional capability to navigate the complexities of sepsis data effectively.
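The screening-then-comparison workflow described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the study's pipeline: it omits the minimum covariance determinant step and the random forest imputation, and the regularization strength, data dimensions, and fold count are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the sepsis cohort (the real study used clinical records).
X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                           random_state=0)

# Step 1: Lasso (L1) penalized logistic regression screens predictors.
lasso = make_pipeline(StandardScaler(),
                      LogisticRegression(penalty="l1", solver="liblinear", C=0.1))
lasso.fit(X, y)
keep = np.flatnonzero(lasso.named_steps["logisticregression"].coef_.ravel() != 0)

# Step 2: compare candidate models on the selected features with 10-fold CV.
scores = {
    "lasso_logit": cross_val_score(lasso, X[:, keep], y, cv=10).mean(),
    "random_forest": cross_val_score(RandomForestClassifier(random_state=0),
                                     X[:, keep], y, cv=10).mean(),
}
```

The study repeats the cross-validation ten times and also evaluates SVM and XGBoost; the structure of the comparison is the same.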

4.
Sensors (Basel) ; 24(16)2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39204919

ABSTRACT

With the rapid advancement of the Internet of Things, network security has garnered increasing attention from researchers. Applying deep learning (DL) has significantly enhanced the performance of Network Intrusion Detection Systems (NIDSs). However, due to its complexity and "black box" problem, deploying DL-based NIDS models in practical scenarios poses several challenges, including the need for model interpretability and lightweight deployment. Feature selection (FS) in DL models plays a crucial role in minimizing model parameters and decreasing computational overheads while enhancing NIDS performance. Hence, selecting effective features remains a pivotal concern for NIDSs. In light of this, this paper proposes an interpretable feature selection method for encrypted traffic intrusion detection based on SHAP and causality principles. This approach uses the results of model interpretation for feature selection to reduce the feature count while ensuring model reliability. We evaluate and validate the proposed method on two public network traffic datasets, CICIDS2017 and NSL-KDD, employing both a CNN and a random forest (RF). Experimental results demonstrate the superior performance of the proposed method.

5.
Entropy (Basel) ; 26(5)2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38785638

ABSTRACT

Traffic state classification and relevance calculation at intersections are both difficult problems in traffic control. In this paper, we propose an intersection relevance model based on a temporal graph attention network, which solves both problems at the same time. First, the intersection features and interaction times of the intersections, together with the initial labels of the traffic data, are used as inputs. They are then fed into the temporal graph attention (TGAT) model to obtain the classification accuracy of the target intersections in four states (free, stable, slow-moving, and congested), and the obtained neighbouring intersection weights are used as the correlation between the intersections. Finally, the model is validated by VISSIM simulation experiments. In terms of classification accuracy, the TGAT model outperforms three traditional classification models and copes well with the uneven distribution of the number of samples. The information gain algorithm from information entropy theory was used to identify the average delay as the most influential factor on intersection status. The correlation from the TGAT model positively correlates with traffic flow, making it interpretable. Using this correlation to control the division of subareas improves the road network's operational efficiency more than the traditional correlation model does. This demonstrates the effectiveness of the TGAT model's correlation.
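The entropy-based information gain used above to rank influencing factors is a standard calculation; a minimal sketch follows. The toy state labels and the binary "high average delay" indicator are invented for illustration, not taken from the study's data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of discrete labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy example: intersection state vs. a binary high-average-delay indicator.
states = ["congested", "congested", "free", "free", "stable", "congested"]
high_delay = [1, 1, 0, 0, 0, 1]
ig = information_gain(states, high_delay)
```

In this toy data the indicator perfectly separates congested from uncongested intersections, so the gain equals exactly one bit.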

6.
BMC Biol ; 22(1): 86, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38637801

ABSTRACT

BACKGROUND: The blood-brain barrier serves as a critical interface between the bloodstream and brain tissue, mainly composed of pericytes, neurons, endothelial cells, and tightly connected basal membranes. It plays a pivotal role in safeguarding the brain from harmful substances, thus protecting the integrity of the nervous system and preserving overall brain homeostasis. However, this remarkable selective transmission also poses a formidable challenge in the treatment of central nervous system diseases, hindering the delivery of large-molecule drugs into the brain. In response to this challenge, many researchers have devoted themselves to developing drug delivery systems capable of breaching the blood-brain barrier. Among these, blood-brain barrier penetrating peptides have emerged as promising candidates. These peptides have the advantages of high biosafety, ease of synthesis, and exceptional penetration efficiency, making them an effective drug delivery solution. While previous studies have developed a few prediction models for blood-brain barrier penetrating peptides, their performance has often been hampered by the issue of limited positive data. RESULTS: In this study, we present Augur, a novel prediction model using borderline-SMOTE-based data augmentation and machine learning. We extract highly interpretable physicochemical properties of blood-brain barrier penetrating peptides while addressing the issues of small sample size and the imbalance of positive and negative samples. Experimental results demonstrate the superior prediction performance of Augur, with an AUC value of 0.932 on the training set and 0.931 on the independent test set. CONCLUSIONS: This newly developed Augur model demonstrates superior performance in predicting blood-brain barrier penetrating peptides, offering valuable insights for drug development targeting neurological disorders.
This breakthrough may enhance the efficiency of peptide-based drug discovery and pave the way for innovative treatment strategies for central nervous system diseases.
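The core of any SMOTE-style augmentation is interpolation between a minority sample and one of its minority-class neighbors. The sketch below shows only that interpolation step with invented 2-D points; borderline-SMOTE, as used by Augur, additionally restricts the seed points to minority samples near the class border, which this sketch omits.

```python
import random

def smote_sample(x, neighbor, rng):
    """Create one synthetic minority example by interpolating between a
    minority point and one of its minority-class neighbors."""
    gap = rng.random()  # uniform in [0, 1): position along the segment
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]

rng = random.Random(0)
# Hypothetical 2-D minority-class points (the real features are
# physicochemical peptide properties).
minority = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.2]]
synthetic = [smote_sample(minority[0], minority[1], rng) for _ in range(5)]
```

Each synthetic point lies on the segment between the two parents, so the augmented set stays inside the minority region rather than duplicating existing samples.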


Subject(s)
Cell-Penetrating Peptides , Central Nervous System Diseases , Humans , Blood-Brain Barrier/chemistry , Endothelial Cells , Cell-Penetrating Peptides/chemistry , Cell-Penetrating Peptides/pharmacology , Cell-Penetrating Peptides/therapeutic use , Brain , Central Nervous System Diseases/drug therapy
7.
Entropy (Basel) ; 26(3)2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38539766

ABSTRACT

It is argued that all physical knowledge ultimately stems from observation and that the simplest possible observation is that an event has happened at a certain space-time location X = (x, t). Considering historic experiments, which have been groundbreaking in the evolution of our modern ideas of matter on the atomic, nuclear, and elementary particle scales, it is shown that such experiments produce as outputs streams of macroscopically observable events which accumulate in the course of time into spatio-temporal patterns of events whose forms allow decisions to be taken concerning conceivable alternatives of explanation. Working towards elucidating the physical and informational characteristics of those elementary observations, we show that these represent hugely amplified images of the initiating micro-events and that the resulting macro-images have a cognitive value of 1 bit and a physical value of W_obs = E_obs τ_obs ≫ h. In this latter equation, E_obs stands for the energy spent in turning the initiating micro-events into macroscopically observable events, τ_obs for the lifetimes during which the generated events remain macroscopically observable, and h for Planck's constant. The relative value G_obs = W_obs / h finally represents a measure of the amplification that was gained in the observation process.

8.
Biostatistics ; 25(3): 833-851, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38330084

ABSTRACT

The development and evaluation of novel treatment combinations is a key component of modern clinical research. The primary goals of factorial clinical trials of treatment combinations range from the estimation of intervention-specific effects, or the discovery of potential synergies, to the identification of combinations with the highest response probabilities. Most factorial studies use balanced or block randomization, with an equal number of patients assigned to each treatment combination, irrespective of the specific goals of the trial. Here, we introduce a class of Bayesian response-adaptive designs for factorial clinical trials with binary outcomes. The study design was developed using Bayesian decision-theoretic arguments and adapts the randomization probabilities to treatment combinations during the enrollment period based on the available data. Our approach enables the investigator to specify a utility function representative of the aims of the trial, and the Bayesian response-adaptive randomization algorithm aims to maximize this utility function. We considered several utility functions and factorial designs tailored to them. Then, we conducted a comparative simulation study to illustrate relevant differences in key operating characteristics across the resulting designs. We also investigated the asymptotic behavior of the proposed adaptive designs. Finally, we used data summaries from three recent factorial trials in perioperative care, smoking cessation, and infectious disease prevention to define realistic simulation scenarios and illustrate advantages of the introduced trial designs compared to other study designs.
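The general idea of response-adaptive randomization with Beta posteriors can be sketched with a Thompson-sampling-style rule. This is not the paper's utility-maximizing decision-theoretic design, just a minimal illustration of skewing assignments toward better-performing combinations; the arm names and outcome counts of the hypothetical 2x2 factorial trial are invented.

```python
import random

def next_assignment(outcomes, rng):
    """outcomes: {arm: (successes, failures)} under a Beta(1, 1) prior.
    Sample each arm's posterior response probability and assign the next
    patient to the arm with the largest draw, so better-performing
    combinations are randomized to more often."""
    draws = {arm: rng.betavariate(1 + s, 1 + f)
             for arm, (s, f) in outcomes.items()}
    return max(draws, key=draws.get)

rng = random.Random(7)
# Hypothetical interim data for the four combinations of a 2x2 factorial.
outcomes = {"A0B0": (2, 8), "A1B0": (5, 5), "A0B1": (4, 6), "A1B1": (9, 1)}
counts = {arm: 0 for arm in outcomes}
for _ in range(1000):
    counts[next_assignment(outcomes, rng)] += 1
```

Because the sampling is from full posteriors rather than point estimates, weaker arms still receive occasional assignments, preserving some exploration.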


Subject(s)
Bayes Theorem , Humans , Uncertainty , Research Design , Randomized Controlled Trials as Topic/methods , Randomized Controlled Trials as Topic/statistics & numerical data , Clinical Trials as Topic/methods , Statistical Models , Algorithms
9.
Dev Sci ; 27(1): e13411, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37211720

ABSTRACT

What drives children to explore and learn when external rewards are uncertain or absent? Across three studies, we tested whether information gain itself acts as an internal reward and suffices to motivate children's actions. We measured 24- to 56-month-olds' persistence in a game where they had to search for an object (animal or toy), which they never find, hidden behind a series of doors, manipulating the degree of uncertainty about which specific object was hidden. We found that children were more persistent in their search when there was higher uncertainty, and therefore more information to be gained with each action, underscoring the value of investing in curiosity-driven algorithms in artificial intelligence research. RESEARCH HIGHLIGHTS: Across three studies, we tested whether information gain itself acts as an internal reward and suffices to motivate preschoolers' actions. We measured preschoolers' persistence when searching for an object behind a series of doors, manipulating the uncertainty about which specific object was hidden. We found that preschoolers were more persistent when there was higher uncertainty, and therefore more information to be gained with each action. Our results underscore the value of investing in curiosity-driven algorithms in artificial intelligence research.


Subject(s)
Artificial Intelligence , Learning , Child , Humans , Exploratory Behavior , Uncertainty , Reward
10.
Cogn Sci ; 47(12): e13396, 2023 12.
Article in English | MEDLINE | ID: mdl-38142430

ABSTRACT

In recent years, a multitude of datasets of human-human conversations has been released for the main purpose of training conversational agents based on data-hungry artificial neural networks. In this paper, we argue that datasets of this sort represent a useful and underexplored source to validate, complement, and enhance cognitive studies on human behavior and language use. We present a method that leverages the recent development of powerful computational models to obtain the fine-grained annotation required to apply metrics and techniques from Cognitive Science to large datasets. Previous work in Cognitive Science has investigated the question-asking strategies of human participants by employing different variants of the so-called 20-question-game setting and proposing several evaluation methods. In our work, we focus on GuessWhat, a task proposed within the Computer Vision and Natural Language Processing communities that is similar in structure to the 20-question-game setting. Crucially, the GuessWhat dataset contains tens of thousands of dialogues based on real-world images, making it a suitable setting to investigate the question-asking strategies of human players on a large scale and in a natural setting. Our results demonstrate the effectiveness of computational tools to automatically code how the hypothesis space changes throughout the dialogue in complex visual scenes. On the one hand, we confirm findings from previous work on smaller and more controlled settings. On the other hand, our analyses allow us to highlight the presence of "uninformative" questions (in terms of Expected Information Gain) at specific rounds of the dialogue. We hypothesize that these questions fulfill pragmatic constraints that are exploited by human players to solve visual tasks in complex scenes successfully. 
Our work illustrates a method that brings together efforts and findings from different disciplines to gain a better understanding of human question-asking strategies on large-scale datasets, while at the same time posing new questions about the development of conversational systems.
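Expected Information Gain for a yes/no question is the prior entropy over the hypothesis space minus the entropy expected after hearing the answer. A minimal sketch follows; the candidate objects and the uniform prior are invented, not drawn from the GuessWhat dataset.

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_information_gain(prior, yes_set):
    """EIG of a yes/no question that is true exactly for the candidates in
    yes_set, given a prior over candidate target objects."""
    p_yes = sum(prior[c] for c in yes_set)
    if p_yes in (0.0, 1.0):
        return 0.0  # an uninformative question: the answer is already known
    post_yes = [prior[c] / p_yes for c in prior if c in yes_set]
    post_no = [prior[c] / (1 - p_yes) for c in prior if c not in yes_set]
    expected_posterior = p_yes * entropy(post_yes) + (1 - p_yes) * entropy(post_no)
    return entropy(list(prior.values())) - expected_posterior

# Uniform prior over four candidate objects in a hypothetical visual scene.
prior = {"dog": 0.25, "cat": 0.25, "car": 0.25, "bus": 0.25}
eig_half = expected_information_gain(prior, {"dog", "cat"})  # "Is it an animal?"
eig_none = expected_information_gain(prior, set())           # answer known already
```

A question that splits the remaining candidates in half yields exactly one bit, while a question whose answer is certain yields zero — the "uninformative" questions discussed above.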


Subject(s)
Communication , Natural Language Processing , Humans , Neural Networks (Computer)
11.
Open Mind (Camb) ; 7: 855-878, 2023.
Article in English | MEDLINE | ID: mdl-37946850

ABSTRACT

Self-directed exploration in childhood appears driven by a desire to resolve uncertainties in order to learn more about the world. However, in adult decision-making, the choice to explore new information rather than exploit what is already known takes many factors beyond uncertainty (such as expected utilities and costs) into account. The evidence for whether young children are sensitive to complex, contextual factors in making exploration decisions is limited and mixed. Here, we investigate whether modifying uncertain options influences explore-exploit behavior in preschool-aged children (48-68 months). Over the course of three experiments, we manipulate uncertain options' ambiguity, expected value, and potential to improve epistemic state for future exploration in a novel forced-choice design. We find evidence that young children are influenced by each of these factors, suggesting that early, self-directed exploration involves sophisticated, context-sensitive decision-making under uncertainty.

12.
Biom J ; 65(8): e2200301, 2023 12.
Article in English | MEDLINE | ID: mdl-37816142

ABSTRACT

An information-theoretic approach applied to clinical trial designs brings several advantages when tackling the problem of finding a balance between power and the expected number of successes (ENS). In particular, it was shown that the built-in parameter of the weight function allows finding the desired trade-off between statistical power and the number of treated patients in the context of small-population Phase II clinical trials. However, in real clinical trials, randomized designs are preferable. The goal of this research is to introduce randomization into a deterministic entropy-based sequential trial procedure generalized to the multiarm setting. Several methods of randomization applied to an entropy-based design are investigated in terms of statistical power and ENS. Namely, four design types are considered: (a) deterministic procedures, (b) naive randomization using the inverse of the entropy criteria as weights, (c) block randomization, and (d) a randomized penalty parameter. The randomized entropy-based designs are compared to randomized Gittins index (GI) and fixed randomization (FR) designs. After a comprehensive simulation study, the following conclusion on block randomization is made: for both entropy-based and GI-based block randomization designs, the degree of randomization induced by forward-looking procedures is insufficient to achieve decent statistical power. Therefore, we propose an adjustment for the forward-looking procedure that improves power at almost no cost in terms of ENS. In addition, the properties of randomization procedures based on a randomly drawn penalty parameter are also thoroughly investigated.


Subject(s)
Research Design , Humans , Random Allocation , Computer Simulation , Sample Size
13.
PeerJ Comput Sci ; 9: e1486, 2023.
Article in English | MEDLINE | ID: mdl-37705665

ABSTRACT

In order to optimize the integration of English multimedia resources and achieve the goal of sharing English teaching resources in education, this article reconstructs the traditional college English curriculum system. It divides professional English into learning modules according to students' majors, integrating public health teaching resources. How to optimize the integration of English multimedia resources and achieve the goal of sharing English teaching resources (ETR) is the main direction of English teaching reform during the current COVID-19 pandemic. An English multimedia teaching resource-sharing platform is designed to extract feature items from multimedia teaching resources using the ID3 information gain method and construct a decision tree for resource push. For resource sharing, a structured peer-to-peer network is used to manage nodes, query locations, and share multimedia teaching resources. The optimal gateway node is selected by calculating the distance between each gateway node and the fixed node. Finally, a collaborative filtering (CF) algorithm recommends multimedia ETR to different users. The simulation results show that the platform can improve the sharing speed and utilization rate of teaching resources, with maximum throughput reaching 12 Mb/s, and achieve accurate recommendations of ETR.

14.
Entropy (Basel) ; 25(6)2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37372279

ABSTRACT

Currently, sentiment analysis is a research hotspot in many fields such as computer science and statistical science. Topic discovery of the literature in the field of text sentiment analysis aims to provide scholars with a quick and effective understanding of its research trends. In this paper, we propose a new model for the topic discovery analysis of literature. Firstly, the FastText model is applied to calculate the word vector of literature keywords, based on which cosine similarity is applied to calculate keyword similarity, to carry out the merging of synonymous keywords. Secondly, the hierarchical clustering method based on the Jaccard coefficient is used to cluster the domain literature and count the literature volume of each topic. Thirdly, the information gain method is applied to extract the high information gain characteristic words of various topics, based on which the connotation of each topic is condensed. Finally, by conducting a time series analysis of the literature, a four-quadrant matrix of topic distribution is constructed to compare the research trends of each topic within different stages. The 1186 articles in the field of text sentiment analysis from 2012 to 2022 can be divided into 12 categories. By comparing and analyzing the topic distribution matrices of the two phases of 2012 to 2016 and 2017 to 2022, it is found that the various categories of topics have obvious research development changes in different phases. The results show that: ① Among the 12 categories, online opinion analysis of social media comments represented by microblogs is one of the current hot topics. ② The integration and application of methods such as sentiment lexicon, traditional machine learning and deep learning should be enhanced. ③ Semantic disambiguation of aspect-level sentiment analysis is one of the current difficult problems this field faces. ④ Research on multimodal sentiment analysis and cross-modal sentiment analysis should be promoted.
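The synonym-merging and clustering steps above rest on two standard similarity measures: cosine similarity between keyword embedding vectors and the Jaccard coefficient between keyword sets. A minimal sketch follows; the vectors and keyword sets are hypothetical, not actual FastText outputs or corpus keywords.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two keyword embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def jaccard(a, b):
    """Jaccard coefficient between two documents' keyword sets."""
    return len(a & b) / len(a | b)

# Parallel vectors are maximally similar; half-overlapping keyword sets
# score 0.5.
sim = cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0])
jac = jaccard({"sentiment", "lexicon", "BERT"}, {"sentiment", "BERT", "CNN"})
```

In the pipeline, keyword pairs whose cosine similarity exceeds a threshold are merged as synonyms, and documents are then hierarchically clustered using Jaccard distances between their keyword sets.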

15.
Acta Crystallogr D Struct Biol ; 79(Pt 4): 271-280, 2023 Apr 01.
Article in English | MEDLINE | ID: mdl-36920335

ABSTRACT

Fast, reliable docking of models into cryo-EM maps requires understanding of the errors in the maps and the models. Likelihood-based approaches to errors have proven to be powerful and adaptable in experimental structural biology, finding applications in both crystallography and cryo-EM. Indeed, previous crystallographic work on the errors in structural models is directly applicable to likelihood targets in cryo-EM. Likelihood targets in Fourier space are derived here to characterize, based on the comparison of half-maps, the direction- and resolution-dependent variation in the strength of both signal and noise in the data. Because the signal depends on local features, the signal and noise are analysed in local regions of the cryo-EM reconstruction. The likelihood analysis extends to prediction of the signal that will be achieved in any docking calculation for a model of specified quality and completeness. A related calculation generalizes a previous measure of the information gained by making the cryo-EM reconstruction.


Subject(s)
Cryoelectron Microscopy , Likelihood Functions , Molecular Models , Crystallography
16.
Acta Crystallogr D Struct Biol ; 79(Pt 4): 281-289, 2023 Apr 01.
Article in English | MEDLINE | ID: mdl-36920336

ABSTRACT

Optimized docking of models into cryo-EM maps requires exploiting an understanding of the signal expected in the data to minimize the calculation time while maintaining sufficient signal. The likelihood-based rotation function used in crystallography can be employed to establish plausible orientations in a docking search. A phased likelihood translation function yields scores for the placement and rigid-body refinement of oriented models. Optimized strategies for choices of the resolution of data from the cryo-EM maps to use in the calculations and the size of search volumes are based on expected log-likelihood-gain scores computed in advance of the search calculation. Tests demonstrate that the new procedure is fast, robust and effective at placing models into even challenging cryo-EM maps.


Subject(s)
Proteins , Proteins/chemistry , Likelihood Functions , Molecular Models , Cryoelectron Microscopy/methods , X-Ray Crystallography , Protein Conformation
17.
Int J Pharm X ; 5: 100164, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36798832

ABSTRACT

Amorphous solid dispersion (ASD) is one of the most important strategies to improve the solubility and dissolution rate of poorly water-soluble drugs. As a widely used technique to prepare ASDs, hot-melt extrusion (HME) provides various benefits, including a solvent-free process, continuous manufacturing, and efficient mixing compared to solvent-based methods, such as spray drying. Energy input, consisting of thermal and specific mechanical energy, should be carefully controlled during the HME process to prevent chemical degradation and residual crystallinity. However, conventional ASD development uses a trial-and-error approach, which is laborious and time-consuming. In this study, we have successfully built multiple machine learning (ML) models to predict the amorphization of crystalline drug formulations and the chemical stability of subsequent ASDs prepared by the HME process. We utilized 760 formulations containing 49 active pharmaceutical ingredients (APIs) and multiple types of excipients. By evaluating the built ML models, we found that ECFP-LightGBM was the best model to predict amorphization, with an accuracy of 92.8%. Furthermore, ECFP-XGBoost was the best in estimating chemical stability, with an accuracy of 96.0%. In addition, feature importance analyses based on SHapley Additive exPlanations (SHAP) and information gain (IG) revealed that several processing parameters and material attributes (i.e., drug loading, polymer ratio, the drug's extended-connectivity fingerprints (ECFP), and the polymer's properties) are critical for achieving accurate predictions for the selected models. Moreover, important API substructures related to amorphization and chemical stability were determined, and the results are largely consistent with the literature. In conclusion, we established ML models to predict the formation of chemically stable ASDs and identify the critical attributes during HME processing.
Importantly, the developed ML methodology has the potential to facilitate the product development of ASDs manufactured by HME with a much reduced human workload.

18.
Artif Intell Med ; 135: 102450, 2023 01.
Article in English | MEDLINE | ID: mdl-36628781

ABSTRACT

Randomized controlled trials (RCTs) offer a clear causal interpretation of treatment effects, but are inefficient in terms of information gain per patient. Moreover, because they are intended to test cohort-level effects, RCTs rarely provide information to support precision medicine, which strives to choose the best treatment for an individual patient. If causal information could be efficiently extracted from widely available real-world data, the rapidity of treatment validation could be increased, and its costs reduced. Moreover, inferences could be made across larger, more diverse patient populations. We created a "virtual trial" by fitting a multilevel Bayesian survival model to treatment and outcome records self-reported by 451 brain cancer patients. The model recovers group-level treatment effects comparable to RCTs representing over 3200 patients. The model additionally discovers the feature-treatment interactions needed to make individual-level predictions for precision medicine. By learning from heterogeneous real-world data, virtual trials can generate more causal estimates with fewer patients than RCTs, and they can do so without artificially limiting the patient population. This demonstrates the value of virtual trials as a complement to large randomized controlled trials, especially in highly heterogeneous or rare diseases.


Subject(s)
Neoplasms , Humans , Neoplasms/drug therapy
19.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38189541

ABSTRACT

There generally exists a critical state or tipping point from one stable state to another in the development of colorectal cancer (CRC), beyond which a significant qualitative transition occurs. Gut microbiome sequencing data can be collected non-invasively from fecal samples, making them convenient to obtain. Furthermore, intestinal microbiome sequencing data contain phylogenetic information at various levels, which can be used to reliably identify critical states, thereby providing early warning signals more accurately and effectively. Yet, pinpointing the critical states using gut microbiome data presents a formidable challenge due to the high dimension and strong noise of such data. To address this challenge, we introduce a novel approach termed the specific network information gain (SNIG) method to detect CRC's critical states at various taxonomic levels via gut microbiome data. Numerical simulation indicates that the SNIG method is robust under different noise levels and superior to existing methods at detecting critical states. Moreover, utilizing SNIG on two real CRC datasets enabled us to discern the critical states preceding deterioration and to successfully identify their associated dynamic network biomarkers at different taxonomic levels. Notably, we discovered certain 'dark species' and pathways intimately linked to CRC progression. In addition, we accurately detected the tipping points on an individual dataset of type 1 diabetes.


Subject(s)
Colorectal Neoplasms , Diabetes Mellitus Type 1 , Gastrointestinal Microbiome , Humans , Phylogeny , Computer Simulation , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics
20.
Diagnostics (Basel) ; 12(12)2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36553007

ABSTRACT

Parkinson's disease (PD) currently affects approximately 10 million people worldwide. The detection of PD-positive subjects is vital in terms of disease prognostics, diagnostics, management and treatment. Different types of early symptoms, such as speech impairment and changes in writing, are associated with Parkinson's disease. To classify potential patients of PD, many researchers have used machine learning algorithms on various datasets related to this disease. In our research, we study the dataset of PD vocal impairment features, which is an imbalanced dataset. We propose a comparative performance evaluation using various decision tree ensemble methods, with or without oversampling techniques. In addition, we compare the performance of classifiers with different sizes of ensembles and various ratios of the minority class to the majority class with oversampling and undersampling. Finally, we combine feature selection with the best-performing ensemble classifiers. The results show that AdaBoost, random forest, and RUSBoost, a decision tree ensemble developed for imbalanced datasets, perform well on performance metrics such as precision, recall, F1-score, area under the receiver operating characteristic curve (AUROC) and the geometric mean. Further, feature selection methods, namely lasso and information gain, were used to screen the 10 best features using the best ensemble classifiers. AdaBoost with the information gain feature selection method is the best-performing ensemble method, with an F1-score of 0.903.
