ABSTRACT
INTRODUCTION: Interictal epileptiform discharges (IEDs) in electroencephalograms (EEGs) are an important biomarker for epilepsy. Currently, the gold standard for IED detection is visual analysis performed by experts. However, this process is subject to expert bias and is time-consuming. Developing fast, accurate, and robust EEG-based detection methods for IEDs may facilitate epilepsy diagnosis. We aimed to assess the performance of deep learning (DL) and classic machine learning (ML) algorithms in classifying EEG segments into IED and non-IED categories, as well as in determining whether an entire EEG contains IEDs. METHODS: We systematically searched PubMed, Embase, and Web of Science following PRISMA guidelines. We excluded studies that only performed the detection of IEDs instead of binary segment classification. Risk of bias was evaluated with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Meta-analysis was performed with R software, using the overall area under the summary receiver operating characteristic (SROC) curve, sensitivity, and specificity as effect measures. RESULTS: A total of 23 studies, comprising 3,629 patients, were eligible for synthesis. Eighteen models performed discharge-level classification, and 6 performed whole-EEG classification. For IED-level classification, 3 models were validated in an external dataset with more than 50 patients and achieved a sensitivity of 84.9 % (95 % CI: 82.3-87.2) and a specificity of 68.7 % (95 % CI: 7.9-98.2). Five studies reported model performance using both internal validation (cross-validation) and external datasets. The meta-analysis revealed higher performance for internal validation, with 90.4 % sensitivity and 99.6 % specificity, compared to external validation, which showed 78.1 % sensitivity and 80.1 % specificity. CONCLUSION: Meta-analysis showed higher performance for models validated with resampling methods compared to those using external datasets. Only a minority of models used more robust validation techniques, and the lack of such validation often leads to overfitting.
ABSTRACT
OBJECTIVES: To predict palatally impacted maxillary canines based on maxilla measurements through supervised machine learning techniques. MATERIALS AND METHODS: Maxilla images from 138 patients, obtained through cone beam computed tomography scans, were analysed to measure intermolar width, interpremolar width, interpterygoid width, maxillary length, maxillary width, nasal cavity width and nostril width. The predictive models were built using the following machine learning algorithms: AdaBoost Classifier, Decision Tree, Gradient Boosting Classifier, K-Nearest Neighbours (KNN), Logistic Regression, Multilayer Perceptron Classifier (MLP), Random Forest Classifier and Support Vector Machine (SVM). A 5-fold cross-validation approach was employed to validate each model. Metrics such as area under the curve (AUC), accuracy, recall, precision and F1 score were calculated for each model, and ROC curves were constructed. RESULTS: The predictive model included four variables (two dental and two skeletal measurements). The interpterygoid width and nostril width showed the largest effect sizes. The Gradient Boosting Classifier algorithm exhibited the best metrics, with AUC values of 0.91 [95% CI = 0.74-0.98] on test data and 0.89 [95% CI = 0.86-0.94] in cross-validation. The nostril width variable demonstrated the highest importance across all tested algorithms. CONCLUSION: The use of maxillary measurements, through supervised machine learning techniques, is a promising method for predicting palatally impacted maxillary canines. Among the models evaluated, both the Gradient Boosting Classifier and the Random Forest Classifier demonstrated the best performance metrics, with accuracy and AUC values exceeding 0.8, indicating strong predictive capability.
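The 5-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It only shows how 138 samples might be partitioned into five folds; the function name `k_fold_indices`, the shuffling seed, and the fold-size rule are assumptions made for illustration, not the authors' implementation.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is reproducible
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 138 matches the number of patients reported in the abstract.
folds = list(k_fold_indices(138, k=5))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

Each sample is used for testing exactly once and for training in the remaining four folds, so averaged metrics are less dependent on a single split.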
ABSTRACT
Protamines play a critical role in DNA compaction and stabilization in sperm cells, significantly influencing male fertility and various biotechnological applications. Traditionally, identifying these proteins has been challenging and time-consuming because of their species-specific variability and complexity. Leveraging advances in computational biology, we present PROTA, a novel tool that combines machine learning (ML) and deep learning (DL) techniques to predict protamines with high accuracy. For the first time, we integrate Generative Adversarial Networks (GANs) with supervised learning methods to enhance the accuracy and generalizability of protamine prediction. We evaluated multiple ML models, including Light Gradient-Boosting Machine (LightGBM), Multilayer Perceptron (MLP), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), k-Nearest Neighbors (KNN), Logistic Regression (LR), Naive Bayes (NB), and Radial Basis Function-Support Vector Machine (RBF-SVM). During ten-fold cross-validation on our training dataset, the MLP model with GAN-augmented data achieved the best performance: 0.997 accuracy, 0.997 F1 score, 0.998 precision, 0.997 sensitivity, and 1.0 AUC. In the independent testing phase, this model achieved 0.999 accuracy, 0.999 F1 score, 1.0 precision, 0.999 sensitivity, and 1.0 AUC. These results establish PROTA, which is accessible via a user-friendly web application, as a highly accurate protamine predictor. We anticipate that PROTA will be a crucial resource for researchers, enabling the rapid and reliable prediction of protamines, thereby advancing our understanding of their roles in reproductive biology, biotechnology, and medicine.
Subject(s)
Deep Learning; Machine Learning; Protamines; Protamines/metabolism; Computational Biology/methods; Support Vector Machine; Humans; Software
ABSTRACT
Objective: To conduct a systematic review of external validation studies on the use of different Artificial Intelligence algorithms in breast cancer screening with mammography. Data source: Our systematic review was conducted and reported following the PRISMA statement, using the PubMed, EMBASE, and Cochrane databases with the search terms "Artificial Intelligence," "Mammography," and their respective MeSH terms. We restricted the search to publications in English from the past ten years (2014 - 2024). Study selection: A total of 1,878 articles were found in the databases searched. After removing duplicates (373) and excluding articles that did not address our PICO question (1,475), 30 studies were included in this work. Data collection: The data from the studies were collected independently by five authors and were subsequently synthesized based on sample data, location, year, and main results in terms of AUC, sensitivity, and specificity. Data synthesis: When Artificial Intelligence was used as a standalone reader, its Area Under the ROC Curve (AUC) and sensitivity were similar to those of radiologists. When it was used in conjunction with radiologists, statistically higher accuracy in mammogram evaluation was reported compared to assessment by radiologists alone. Conclusion: AI algorithms have emerged as a means to complement and enhance the performance and accuracy of radiologists. They also assist less experienced professionals in detecting possible lesions.
Subject(s)
Artificial Intelligence; Breast Neoplasms; Mammography; Mammography/methods; Humans; Female; Breast Neoplasms/diagnostic imaging; Early Detection of Cancer/methods; Sensitivity and Specificity; Algorithms; Validation Studies as Topic
ABSTRACT
Urban Heat Islands are a major environmental and public health concern because they raise temperatures in urban areas. This study used satellite imagery and machine learning to analyze the spatial and temporal patterns of land surface temperature distribution in the Metropolitan Area of Merida (MAM), Mexico, from 2001 to 2021. The results show that land surface temperature has increased in the MAM over the study period, while the urban footprint has expanded. The study also found a high correlation (r > 0.8) between changes in land surface temperature and land cover classes (urbanization/deforestation). If the current urbanization trend continues, the difference between the land surface temperature of the MAM and its surroundings is expected to reach 3.12 °C ± 1.11 °C by the year 2030. Hence, the findings of this study suggest that the Urban Heat Island effect is a growing problem in the MAM and highlight the importance of satellite imagery and machine learning for monitoring and developing mitigation strategies.
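The correlation threshold quoted in this abstract (r > 0.8) refers to a correlation coefficient; assuming it is Pearson's r, the calculation can be sketched as below. The two short series are hypothetical values invented for illustration, not data from the study, and `pearson_r` is an illustrative helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical example: urban land-cover fraction vs. mean land surface temperature (°C).
urban_fraction = [0.10, 0.15, 0.22, 0.30, 0.41]
surface_temp_c = [30.1, 30.6, 31.0, 31.9, 32.8]
r = pearson_r(urban_fraction, surface_temp_c)
print(f"r = {r:.3f}, exceeds 0.8: {r > 0.8}")
```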
ABSTRACT
The integration of machine learning (ML) with edge computing and wearable devices is rapidly advancing healthcare applications. This study systematically maps the literature in this emerging field, analyzing 171 studies and focusing on 28 key articles after rigorous selection. The research explores the key concepts, techniques, and architectures used in healthcare applications involving ML, edge computing, and wearable devices. The analysis reveals a significant increase in research over the past six years, particularly in the last three years, covering applications such as fall detection, cardiovascular monitoring, and disease prediction. The findings highlight a strong focus on neural network models, especially Convolutional Neural Networks (CNNs) and Long Short-Term Memory Networks (LSTMs), and diverse edge computing platforms like Raspberry Pi and smartphones. Despite the diversity in approaches, the field is still nascent, indicating considerable opportunities for future research. The study emphasizes the need for standardized architectures and the further exploration of both hardware and software to enhance the effectiveness of ML-driven healthcare solutions. The authors conclude by identifying potential research directions that could contribute to continued innovation in healthcare technologies.
Subject(s)
Machine Learning; Neural Networks, Computer; Wearable Electronic Devices; Humans; Delivery of Health Care; Smartphone; Monitoring, Physiologic/instrumentation; Monitoring, Physiologic/methods
ABSTRACT
Muscle tone is defined as the resistance to passive stretch, but this definition is often criticized for its ambiguity since some suggest it is related to a state of preparation for movement. Muscle tone is primarily regulated by the central nervous system, and individuals with neurological disorders may lose the ability to control normal tone and can exhibit abnormalities. Currently, these abnormalities are mostly evaluated using subjective scales, highlighting a lack of objective assessment methods in the literature. This study aimed to use surface electromyography (sEMG) and machine learning (ML) for the objective classification and characterization of the full spectrum of muscle tone in the upper limb. Data were collected from thirty-nine individuals, including spastic, healthy, hypotonic and rigid subjects. All of the classifiers applied achieved high accuracy, with the best reaching 96.12%, in differentiating muscle tone. These results underscore the potential of the proposed methodology as a more reliable and quantitative method for evaluating muscle tone abnormalities, aiming to address the limitations of traditional subjective assessments. Additionally, the main features impacting the classifiers' performance were identified, which can be utilized in future research and in the development of devices that can be used in clinical practice.
Subject(s)
Electromyography; Machine Learning; Muscle Tonus; Humans; Electromyography/methods; Male; Adult; Female; Muscle Tonus/physiology; Muscle, Skeletal/physiology; Young Adult
ABSTRACT
Objective: This study evaluates machine learning algorithms' effectiveness in classifying Parkinson's disease and Huntington's disease based on biomarker data obtained non-invasively from patients and healthy controls. Methods: Datasets containing biomarker data (x, y, and z values of accelerometers) from sensors were collected from Parkinson's disease, Huntington's disease patients, and healthy controls. An automatic selection model method was implemented for disease classification, using a unique Mexican database of human gait biomarkers, which we consider the only one of its kind. Random forest, random subspace method, and K-star algorithms were employed, with parameters optimized through an automated model selection. Results: The study achieved a 0.893 precision rate for Parkinson's disease and Huntington's disease using the random subspace method. The findings underscore the potential of machine learning techniques in medical diagnosis, particularly in neurological disorders. Conclusion: The automatic selection model method demonstrated efficacy in classifying Parkinson's disease and Huntington's disease based on non-invasive biomarker data. This research contributes to advancing non-invasive diagnostic approaches in neurological disorders, highlighting the significance of machine learning in healthcare.
ABSTRACT
Background: Infections caused by antibiotic-resistant bacteria pose a major challenge to modern healthcare. This systematic review evaluates the efficacy of machine learning (ML) approaches in predicting antimicrobial resistance (AMR) in critical pathogens (CP), considering Whole Genome Sequencing (WGS) and antimicrobial susceptibility testing (AST). Methods: The search covered databases including PubMed/MEDLINE, EMBASE, Web of Science, SCOPUS, and SCIELO, from their inception until June 2024. The review protocol was registered on PROSPERO (CRD42024543099). Results: The review included 26 papers, analyzing data from 104,141 microbial samples. Random Forest (RF), XGBoost, and logistic regression (LR) emerged as the top-performing models, with mean area under the receiver operating characteristic curve (AUC) values of 0.89, 0.87, and 0.87, respectively. RF had the highest mean AUC, with values ranging from 0.66 to 0.97, while XGBoost and LR showed similar performance, with AUC values ranging from 0.83 to 0.91 and 0.76 to 0.96, respectively. Most studies indicate that integrating WGS and AST data into ML models enhances predictive performance, improves antibiotic stewardship, and provides valuable clinical decision support. ML shows significant promise for predicting AMR by integrating WGS and AST data in CP. Standardized guidelines are needed to ensure consistency in future research.
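The AUC values compared in this abstract can be read as the probability that a model scores a randomly chosen resistant isolate above a randomly chosen susceptible one. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the reviewed studies.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = resistant, 0 = susceptible; scores are predicted probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large datasets.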
Subject(s)
Drug Resistance, Bacterial; Machine Learning; Microbial Sensitivity Tests; Whole Genome Sequencing; Humans; Drug Resistance, Bacterial/genetics; Anti-Bacterial Agents/therapeutic use; Anti-Bacterial Agents/pharmacology; Bacteria/drug effects; Bacteria/genetics
ABSTRACT
Urochloa grasses are widely used forages in the Neotropics and are gaining importance in other regions due to their role in meeting the increasing global demand for sustainable agricultural practices. High-throughput phenotyping (HTP) is important for accelerating Urochloa breeding programs focused on improving forage and seed yield. While RGB imaging has been used for HTP of vegetative traits, the assessment of phenological stages and seed yield using image analysis remains unexplored in this genus. This work presents a dataset of 2,400 high-resolution RGB images of 200 Urochloa hybrid genotypes, captured over seven months and covering both vegetative and reproductive stages. Images were manually labelled as vegetative or reproductive, and a subset of 255 reproductive stage images were annotated to identify 22,340 individual racemes. This dataset enables the development of machine learning and deep learning models for automated phenological stage classification and raceme identification, facilitating HTP and accelerated breeding of Urochloa spp. hybrids with high seed yield potential.
ABSTRACT
This research evaluates the application of advanced machine learning algorithms, specifically Random Forest and Gradient Boosting, for the imputation of missing data in solar energy generation databases and their impact on the sizing of green hydrogen production systems. The study demonstrates that the Random Forest model notably excels in harnessing solar data to optimize hydrogen production, achieving superior prediction accuracy with a mean absolute error (MAE) of 0.0364, mean squared error (MSE) of 0.0097, root mean squared error (RMSE) of 0.0985, and a coefficient of determination (R²) of 0.9779. These metrics surpass those obtained from baseline models including linear regression and recurrent neural networks, highlighting the potential of accurate imputation to significantly enhance the efficiency and output of renewable energy systems. The findings advocate for the integration of robust data imputation methods in the design and operation of photovoltaic systems, contributing to the reliability and sustainability of energy resource management. Furthermore, this research makes significant contributions by showcasing the comparative performance of traditional machine learning models in handling data gaps, emphasizing the practical implications of data imputation on optimizing hydrogen production systems. By providing a detailed analysis and validation of the imputation models, this work offers valuable insights for future advancements in renewable energy technology.
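The four error metrics reported in this abstract follow standard definitions, sketched below in plain Python. The helper name `regression_metrics` and the small example vectors are illustrative assumptions; the final lines only check that the reported RMSE is consistent with the reported MSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_residual = sum(e * e for e in errors)
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_residual / ss_total
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical observed vs. imputed values (not data from the study).
example = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(example)

# Consistency check on the reported figures: RMSE should equal sqrt(MSE).
reported_mse = 0.0097
print(round(math.sqrt(reported_mse), 4))  # 0.0985, matching the reported RMSE
```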
ABSTRACT
INTRODUCTION AND OBJECTIVES: With the rising prevalence of pre-sarcopenia in metabolic dysfunction-associated steatotic liver disease (MASLD), this study aimed to develop and validate a machine learning-based model to identify pre-sarcopenia in the MASLD population. MATERIALS AND METHODS: A total of 571 MASLD subjects were screened from the National Health and Nutrition Examination Survey (NHANES) 2017-2018. This cohort was randomly divided into a training set and an internal testing set at a ratio of 7:3. Sixty-six MASLD subjects were collected from our institution as an external validation set. Four binary classifiers, namely Random Forest (RF), support vector machine, extreme gradient boosting, and logistic regression, were fitted to identify pre-sarcopenia. The best-performing model was further validated in the external validation set. Model performance was assessed in terms of discrimination and calibration. Shapley Additive explanations were used for model interpretability. RESULTS: The pre-sarcopenia rate was 17.51 % and 15.16 % in the NHANES cohort and the external validation set, respectively. RF outperformed the other models, with an area under the receiver operating characteristic curve (AUROC) of 0.819 (95 %CI: 0.749, 0.889). When the six top-ranking features by variable importance were retained (weight-adjusted waist, sex, race, creatinine, education and alkaline phosphatase), the final RF model reached an AUROC of 0.824 (95 %CI: 0.737, 0.910) and 0.732 (95 %CI: 0.529, 0.936) in the internal and external validation sets, respectively. Sensitivity analysis supported the robustness of the model. The calibration curve and decision curve analysis indicated good calibration and good clinical utility. CONCLUSIONS: This study proposed a user-friendly model using an explainable machine learning algorithm to predict pre-sarcopenia in the MASLD population. A web-based tool was provided for screening pre-sarcopenia in community and hospital settings.
ABSTRACT
Breast cancer is a highly heterogeneous disease characterized by different subtypes arising from molecular alterations that give the disease different phenotypes, clinical behaviors, and prognoses. Noncoding RNA (ncRNA)-derived micropeptides (MPs) represent a novel layer of complexity in cancer research, since they can be biologically active and have potential as biomarkers and in therapeutics. However, few large-scale studies address the expression of these peptides at the peptidomics level or evaluate their functions and potential in peptide-based therapeutics for breast cancer. In this study, we aim to deepen the landscape of ncRNA-derived MPs in breast cancer subtypes and advance the understanding of the relevance of these molecules to the disease. First, we constructed a data set of 16,349 unique putative MP sequences by integrating 2 previously published lists of predicted ncRNA-derived MPs. We evaluated their expression in high-throughput mass spectrometry data from breast tumor samples of different subtypes. Next, we applied several machine and deep learning tools, such as AntiCP 2.0, MULocDeep, PEPstrMOD, Peptipedia, and PreAIP, to predict their functions, cellular localization, tertiary structure, physicochemical features, and other properties related to therapeutics. We identified 58 peptides expressed in breast tissue, including 27 MPs differentially expressed in tumor compared with nontumor samples and MPs exhibiting tumor or subtype specificity. These peptides presented physicochemical features compatible with the canonical proteome and were predicted to influence the tumor immune environment and participate in cell communication, metabolism, and signaling processes. In addition, some MPs presented potential as anticancer, antiinflammatory, and antiangiogenic molecules. Our data demonstrate that MPs derived from ncRNAs have expression patterns associated with specific breast cancer subtypes and tumor specificity, thus highlighting their potential as biomarkers for molecular classification. We also reinforce the relevance of MPs as biologically active molecules that play a role in breast tumorigenesis, in addition to their potential in peptide-based therapeutics.
ABSTRACT
OBJECTIVE: This study introduces the complete blood count (CBC), a standard prenatal screening test, as a biomarker for diagnosing preeclampsia with severe features (sPE), employing machine learning models. METHODS: We used a boosting machine learning model fed with synthetic data generated through a new methodology called DAS (Data Augmentation and Smoothing). Using data from a Brazilian study including 132 pregnant women, we generated 3,552 synthetic samples for model training. To improve interpretability, we also provided a ridge regression model. RESULTS: Our boosting model obtained an AUROC of 0.90±0.10, a sensitivity of 0.95, and a specificity of 0.79 in differentiating sPE from non-PE pregnant women, using the CBC parameters of neutrophil count, mean corpuscular hemoglobin (MCH), and the aggregate index of systemic inflammation (AISI). In addition, we provided a ridge regression equation using the same three CBC parameters, which is fully interpretable and achieved an AUROC of 0.79±0.10 in differentiating the two groups. Moreover, we showed that a monocyte count lower than 490/mm³ yielded a sensitivity of 0.71 and a specificity of 0.72. CONCLUSION: Our study showed that ML-powered CBC could be used as a biomarker to support sPE diagnosis. In addition, we showed that a low monocyte count alone could be an indicator of sPE. SIGNIFICANCE: Although preeclampsia has been extensively studied, no laboratory biomarker with favorable cost-effectiveness has been proposed. Using artificial intelligence, we propose the CBC, a low-cost, fast, and widely available blood test, as a biomarker for sPE.
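The single-marker rule in this abstract (a monocyte count below 490/mm³ flags sPE) is a simple cutoff classifier. The sketch below shows how sensitivity and specificity would be computed for such a rule; the function name and the eight monocyte counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(values, has_condition, cutoff):
    """Score the rule 'value < cutoff means positive' against known outcomes."""
    tp = fn = tn = fp = 0
    for value, condition in zip(values, has_condition):
        predicted_positive = value < cutoff
        if condition and predicted_positive:
            tp += 1
        elif condition:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the condition")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical monocyte counts (cells/mm³) and sPE status, for illustration only.
monocytes = [300, 450, 520, 480, 600, 700, 450, 800]
has_spe = [True, True, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(monocytes, has_spe, cutoff=490)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Moving the cutoff trades sensitivity against specificity, which is why a single threshold is usually reported alongside both values.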
Subject(s)
Biomarkers; Machine Learning; Pre-Eclampsia; Humans; Pre-Eclampsia/diagnosis; Pre-Eclampsia/blood; Female; Pregnancy; Biomarkers/blood; Blood Cell Count/methods; Adult; Sensitivity and Specificity; Brazil; Severity of Illness Index; ROC Curve; Prenatal Diagnosis/methods
ABSTRACT
BACKGROUND: Battling malaria's morbidity and mortality rates demands innovative methods related to malaria diagnosis. Thick blood smears (TBS) are the gold standard for diagnosing malaria, but their coloration quality is dependent on supplies and adherence to standard protocols. Machine learning has been proposed to automate diagnosis, but the impact of smear coloration on parasite detection has not yet been fully explored. METHODS: To develop Coloration Analysis in Malaria (CAM), an image database containing 600 images was created. The database was randomly divided into training (70%), validation (15%), and test (15%) sets. Nineteen feature vectors were studied based on variances, correlation coefficients, and histograms (specific variables from histograms, full histograms, and principal components from the histograms). The Machine Learning Matlab Toolbox was used to select the best candidate feature vectors and machine learning classifiers. The candidate classifiers were then tuned for validation and tested to ultimately select the best one. RESULTS: This work introduces CAM, a machine learning system designed for automatic TBS image quality analysis. The results demonstrated that the cubic SVM classifier outperformed others in classifying coloration quality in TBS, achieving a true negative rate of 95% and a true positive rate of 97%. CONCLUSIONS: An image-based approach was developed to automatically evaluate the coloration quality of TBS. This finding highlights the potential of image-based analysis to assess TBS coloration quality. CAM is intended to function as a supportive tool for analyzing the coloration quality of thick blood smears.
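The random 70%/15%/15% partition of the 600-image database described in this abstract can be sketched as follows. This is an assumption-laden illustration in Python, not the authors' Matlab workflow; `split_dataset`, the seed, and the integer image identifiers are hypothetical.

```python
import random


def split_dataset(items, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle items and split them into training, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_train = round(len(shuffled) * fractions[0])
    n_val = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, validation, test


# 600 placeholder image identifiers, matching the database size in the abstract.
image_ids = list(range(600))
train_ids, val_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(val_ids), len(test_ids))
```

Keeping the test set untouched until the classifiers have been tuned on the validation set is what allows the final test metrics to serve as an estimate of performance on unseen images.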
Subject(s)
Image Processing, Computer-Assisted; Machine Learning; Image Processing, Computer-Assisted/methods; Humans; Malaria; Color
ABSTRACT
Neutrophils are the innate immune system's first line of defense, and their storage organelles are essential to their function. The storage organelles are divided into three granule types, named azurophilic, specific, and gelatinase granules, plus a fourth component called secretory vesicles. The isolation of neutrophil granules is challenging, and the existing procedures rely on large sample volumes, about 400 mL of peripheral blood, precluding the use of multiple biological and technical replicates. Therefore, the aim of this study was to develop a miniaturized isolation of neutrophil granules (MING) method, using biochemical assays, mass spectrometry-based proteomics and a machine learning approach to investigate the protein content of these organelles. Neutrophils were isolated from 40 mL of blood collected from three apparently healthy volunteers and disrupted using nitrogen cavitation; the organelles were fractionated with a discontinuous 3-layer Percoll density gradient. The isolation proved successful and allowed for a reasonable separation of neutrophil storage organelles using a gradient approximately 37 times smaller than those of the methods described in the literature. Moreover, mass spectrometry-based proteomics identified 368 proteins in at least 3 of the 5 analyzed samples, and using a machine learning strategy aligned with markers from the literature, the localization of 50 proteins was predicted with confidence. When using markers determined within our dataset by a clustering tool, the localization of 348 proteins was confidently determined. Importantly, this study was the first to investigate the proteome of neutrophil granules using technical and biological replicates, creating a reliable database for further studies.
ABSTRACT
The COVID-19 pandemic, caused by SARS-CoV-2, has led to significant challenges worldwide, including diverse clinical outcomes and prolonged post-recovery symptoms known as Long COVID or Post-COVID-19 syndrome. Emerging evidence suggests a crucial role of metabolic reprogramming in the infection's long-term consequences. This study employs a novel approach utilizing machine learning (ML) and explainable artificial intelligence (XAI) to analyze metabolic alterations in COVID-19 and Post-COVID-19 patients. Samples were taken from a cohort of 142 COVID-19, 48 Post-COVID-19, and 38 control patients, comprising 111 identified metabolites. Traditional analysis methods, like PCA and PLS-DA, were compared with ML techniques, particularly eXtreme Gradient Boosting (XGBoost) enhanced by SHAP (SHapley Additive exPlanations) values for explainability. XGBoost, combined with SHAP, outperformed traditional methods, demonstrating superior predictive performance and providing new insights into the metabolic basis of the disease's progression and aftermath. The analysis revealed metabolomic subgroups within the COVID-19 and Post-COVID-19 conditions, suggesting heterogeneous metabolic responses to the infection and its long-term impacts. Key metabolic signatures in Post-COVID-19 include taurine, glutamine, alpha-Ketoglutaric acid, and LysoPC a C16:0. This study highlights the potential of integrating ML and XAI for a fine-grained description in metabolomics research, offering a more detailed understanding of metabolic anomalies in COVID-19 and Post-COVID-19 conditions.
ABSTRACT
Artisanal gold mining can lead to soil contamination with potentially toxic elements (PTEs), necessitating soil quality monitoring due to environmental and human health risks. However, determining PTE levels through acid digestion is time-consuming, generates chemical waste, and requires significant resources. As an alternative, portable X-ray fluorescence (pXRF) offers a faster, more cost-effective, and sustainable analysis. This study compared total As, Ba, Cr, Cu, Fe, Mn, Ni, Pb, Sr, Ti, V, and Zn obtained from pXRF with their pseudo-total contents obtained through acid digestion (USEPA method 3051A) in areas influenced by artisanal gold mining in the Eastern Amazon, Brazil. pXRF data and machine learning algorithms were used to predict extractable Cu, Fe, Mn, and Zn. Linear regression models were fitted to compare the two methods, and random forest and support vector machine techniques were used to predict extractable contents. The best regression model fits for the pseudo-total PTE contents were those for Cu, Fe, Mn and Pb in agricultural areas (R² > 0.80); Fe and Mn in gold mining residue (R² > 0.70); and Ba, Cu and Mn in urban areas (R² > 0.80). The best models for predicting the extractable PTE contents were those for Cu (R² = 0.72; RMSE = 2.58 mg dm⁻³) and Zn (R² = 0.71; RMSE = 1.44 mg dm⁻³) in agricultural areas and for Zn (R² = 0.72; RMSE = 0.43 mg dm⁻³) in gold mining residue. The results demonstrated that pXRF can characterize and predict PTE contents in mining-impacted areas, offering a sustainable approach to soil quality analysis.
Subject(s)
Agriculture; Environmental Monitoring; Gold; Mining; Soil Pollutants; Brazil; Soil Pollutants/analysis; Environmental Monitoring/methods; Soil/chemistry; Metals, Heavy/analysis; Spectrometry, X-Ray Emission; Cities
ABSTRACT
BACKGROUND: Artificial intelligence (AI) algorithms for the detection of retinoblastoma (RB) by fundus image analysis have been proposed as a potentially effective technique to facilitate diagnosis and screening programs. However, doubts remain about the accuracy of the technique, the best type of AI for this situation, and its feasibility for everyday use. Therefore, we performed a systematic review and meta-analysis to evaluate this issue. METHODS: Following PRISMA 2020 guidelines, a comprehensive search of MEDLINE, Embase, ClinicalTrials.gov and IEEEX databases identified 494 studies whose titles and abstracts were screened for eligibility. We included diagnostic studies that evaluated the accuracy of AI in identifying retinoblastoma based on fundus imaging. Univariate and bivariate analyses were performed using the random effects model. The study protocol was registered in PROSPERO under CRD42024499221. RESULTS: Six studies with 9902 fundus images were included, of which 5944 (60%) had confirmed RB. Only one dataset used a semi-supervised machine learning (ML) based method; all other studies used supervised ML, three using architectures requiring high computational power and two using more economical models. The pooled analysis of all models showed a sensitivity of 98.2% (95% CI: 94.7%-99.4%), a specificity of 98.5% (95% CI: 91.6%-99.8%) and an AUC of 0.986 (95% CI: 0.970-0.989). Subgroup analyses comparing models with high and low computational power showed no significant difference (p=0.824). CONCLUSIONS: AI methods showed high accuracy in the diagnosis of RB based on fundus images, with no significant difference between high and low computational power models, suggesting that their use is viable. Validation and cost-effectiveness studies are needed in countries of different income levels. Subpopulations should also be analyzed, as AI may be useful as an initial screening tool in populations at high risk for RB, serving as a bridge to the pediatric ophthalmologist or ocular oncologist, who are scarce globally. KEY MESSAGES: What is known: Retinoblastoma is the most common intraocular cancer in childhood, and diagnostic delay is the main factor leading to a poor prognosis. Machine learning techniques offer reliable methods for screening and diagnosis of retinal diseases. What is new: The meta-analysis of the diagnostic accuracy of artificial intelligence methods for diagnosing retinoblastoma based on fundus images showed a sensitivity of 98.2% (95% CI: 94.7%-99.4%) and a specificity of 98.5% (95% CI: 91.6%-99.8%). There was no statistically significant difference in the diagnostic accuracy of high and low computational power models. The overall performance of supervised machine learning was better than that of unsupervised machine learning, although few studies were available on the latter.
ABSTRACT
OBJECTIVE: The objective of this study was to analyze the incidence and overall survival (OS) of osteosarcoma (OSC) and Ewing's sarcoma (EWS) in a pediatric and adolescent population, employing machine learning (ML) and deep learning (DL) models to predict the likelihood of metastasis. METHODS: The study included 2465 OSC and 1373 EWS patients aged 0-19 years, from 2004 to 2020. ML techniques (Lasso, Ridge Regression, Elastic Net, and Random Forest) were used alongside a deep learning model based on TensorFlow and Keras to construct predictive models for metastasis. These models were optimized using grid search with cross-validation and evaluated on performance metrics including AUC, sensitivity, and accuracy. The importance of each variable in metastasis prediction was determined using SHAP values. Statistical analysis was performed using R software, and an online nomogram was developed for clinical use. RESULTS: The age-adjusted incidence of OSC and EWS from 2004 to 2020 showed a significant upward trend. The deep learning model, trained for 50 iterations, outperformed the Random Forest model in both loss and accuracy stabilization. The nomogram demonstrated accurate survival predictions, as evidenced by its calibration curves and the separation between high- and low-risk groups. CONCLUSION: The increasing trend in age-adjusted incidence of OSC and EWS highlights the need for continued research and improved therapeutic strategies in this domain. The study employed ML and DL models to predict distant metastasis in pediatric and adolescent patients with OSC and EWS, providing a valuable tool for prognosis. The online nomogram developed as part of this research enhances the models' clinical utility, offering an accessible means for clinicians to predict survival outcomes effectively.