1.
J Imaging Inform Med ; 2024 Sep 19.
Article in English | MEDLINE | ID: mdl-39299957

ABSTRACT

Deep learning (DL) tools developed on adult datasets may not generalize well to pediatric patients, posing potential safety risks. We evaluated the performance of TotalSegmentator, a state-of-the-art adult-trained CT organ segmentation model, on a subset of organs in a pediatric CT dataset and explored optimization strategies to improve pediatric segmentation performance. TotalSegmentator was retrospectively evaluated on abdominal CT scans from an external adult dataset (n = 300) and an external pediatric dataset (n = 359). Generalizability was quantified by comparing Dice scores between the adult and pediatric external datasets using Mann-Whitney U tests. Two DL optimization approaches were then evaluated: (1) a 3D nnU-Net model trained on only pediatric data, and (2) an adult nnU-Net model fine-tuned on the pediatric cases. TotalSegmentator had significantly lower overall mean Dice scores on pediatric vs. adult CT scans (0.73 vs. 0.81, P < .001), demonstrating limited generalizability to pediatric CT scans. Stratified by organ, the mean pediatric Dice score was lower for four organs (P < .001 for all): right and left adrenal glands (right adrenal, 0.41 [0.39-0.43] vs. 0.69 [0.66-0.71]; left adrenal, 0.35 [0.32-0.37] vs. 0.68 [0.65-0.71]); duodenum (0.47 [0.45-0.49] vs. 0.67 [0.64-0.69]); and pancreas (0.73 [0.72-0.74] vs. 0.79 [0.77-0.81]). Performance on pediatric CT scans improved with both pediatric-specific models and an adult-trained model fine-tuned on pediatric images: both methods significantly improved segmentation accuracy over TotalSegmentator for all organs, especially for smaller anatomical structures (e.g., > 0.2 higher mean Dice for adrenal glands; P < .001).
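
For readers unfamiliar with the metric, the Dice score used above can be sketched as follows. This is a minimal illustration, not the study's evaluation code: the function name `dice_score`, the flat 0/1 mask representation, and the convention for two empty masks are all assumptions made here.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 voxel labels (a hypothetical
    representation chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as organ in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total organ voxels in each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A score of 1.0 means the predicted and reference masks coincide and 0.0 means they do not overlap at all, which is why small structures such as the adrenal glands, where a few misplaced voxels change the overlap substantially, tend to score lower.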

2.
Front Plant Sci ; 15: 1366395, 2024.
Article in English | MEDLINE | ID: mdl-38774219

ABSTRACT

This paper presents a robust deep learning method for fruit decay detection and plant identification. By addressing the limitations of previous studies that primarily focused on model accuracy, our approach aims to provide a more comprehensive solution that considers the challenges of robustness and limited data scenarios. The proposed method achieves an accuracy of 99.93%, surpassing established models. Beyond accuracy, the proposed method highlights the significance of robustness and adaptability in limited data scenarios. The proposed model performs strongly even under challenging conditions, such as intense lighting variations and partial image obstructions. Extensive evaluations demonstrate its robust performance, generalization ability, and minimal misclassifications. The inclusion of Class Activation Maps enhances the model's capability to identify distinguishing features between fresh and rotten fruits. This research has significant implications for fruit quality control, economic loss reduction, and applications in agriculture, transportation, and scientific research. The proposed method serves as a valuable resource for fruit and plant-related industries. It offers precise adaptation to specific data, customization of the network architecture, and effective training even with limited data. Overall, this research contributes to fruit quality control, economic loss reduction, and waste minimization.

3.
Int J Med Inform ; 186: 105397, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38507979

ABSTRACT

BACKGROUND: Early prediction of acute respiratory distress syndrome (ARDS) in critically ill patients in intensive care units (ICUs) has been intensively studied in recent years. Yet a prediction model trained on data from one hospital might not generalize well to other hospitals. It is therefore essential to develop an accurate and generalizable ARDS prediction model that adapts to different hospitals or medical centers. METHODS: We analyzed electronic medical records of 200,859 and 50,920 hospitalized patients within 24 h after being diagnosed with ARDS from the Philips eICU Institute (eICU-CRD) and the Medical Information Mart for Intensive Care (MIMIC-IV) dataset, respectively. Patients were sorted into three groups (rapid death, long stay, and recovery) based on their condition or outcome between 24 and 72 h after ARDS diagnosis. To improve prediction performance and generalizability, a "pretrain-finetune" approach was applied: we pretrained models on the eICU-CRD dataset, finetuned them using only a part (35%) of the MIMIC-IV dataset, and then tested the finetuned models on the remaining MIMIC-IV data. Well-known machine-learning algorithms, including logistic regression, random forest, extreme gradient boosting, and multilayer perceptron neural networks, were employed to predict ARDS outcomes. Prediction performance was evaluated using the area under the receiver-operating characteristic curve (AUC). RESULTS: In general, multilayer perceptron neural networks outperformed the other models. The pretrain-finetune approach improved performance in predicting ARDS outcomes, achieving a micro-AUC of 0.870 on the MIMIC-IV dataset, an improvement of 0.046 over the pretrained model. CONCLUSIONS: The proposed pretrain-finetune approach can effectively improve model generalizability from one dataset to another in ARDS prediction.


Subject(s)
Algorithms, Respiratory Distress Syndrome, Humans, Prognosis, Critical Care, Electronic Health Records, Respiratory Distress Syndrome/diagnosis, Respiratory Distress Syndrome/therapy

4.
J Am Med Inform Assoc ; 31(5): 1051-1061, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38412331

ABSTRACT

BACKGROUND: Predictive models show promise in healthcare, but their successful deployment is challenging due to limited generalizability. Current external validation often focuses on model performance with restricted feature use from the original training data, lacking insights into their suitability at external sites. Our study introduces an innovative methodology for evaluating features during both the development phase and the validation, focusing on creating and validating predictive models for post-surgery patient outcomes with improved generalizability. METHODS: Electronic health records (EHRs) from 4 countries (United States, United Kingdom, Finland, and Korea) were mapped to the OMOP Common Data Model (CDM), 2008-2019. Machine learning (ML) models were developed to predict post-surgery prolonged opioid use (POU) risks using data collected 6 months before surgery. Both local and cross-site feature selection methods were applied in the development and external validation datasets. Models were developed using Observational Health Data Sciences and Informatics (OHDSI) tools and validated on separate patient cohorts. RESULTS: Model development included 41 929 patients, 14.6% with POU. The external validation included 31 932 (UK), 23 100 (US), 7295 (Korea), and 3934 (Finland) patients with POU of 44.2%, 22.0%, 15.8%, and 21.8%, respectively. The top-performing model, Lasso logistic regression, achieved an area under the receiver operating characteristic curve (AUROC) of 0.75 during local validation and 0.69 (SD = 0.02) (averaged) in external validation. Models trained with cross-site feature selection significantly outperformed those using only features from the development site through external validation (P < .05). CONCLUSIONS: Using EHRs across four countries mapped to the OMOP CDM, we developed generalizable predictive models for POU. 
Our approach demonstrates the significant impact of cross-site feature selection in improving model performance, underscoring the importance of incorporating diverse feature sets from various clinical settings to enhance the generalizability and utility of predictive healthcare models.


Subject(s)
Data Science, Medical Informatics, Humans, Logistic Models, United Kingdom, Finland

5.
Ophthalmol Glaucoma ; 7(1): 8-15, 2024.
Article in English | MEDLINE | ID: mdl-37437884

ABSTRACT

PURPOSE: To assess the performance and generalizability of a convolutional neural network (CNN) model for objective and high-throughput identification of primary angle-closure disease (PACD) as well as PACD stage differentiation on anterior segment swept-source OCT (AS-OCT). DESIGN: Cross-sectional. PARTICIPANTS: Patients from 3 different eye centers across China and Singapore were recruited for this study. Eight hundred forty-one eyes from the 2 Chinese centers were divided into 170 control eyes, 488 PACS, and 183 PAC + PACG eyes. An additional 300 eyes were recruited from Singapore National Eye Center as a testing data set, divided into 100 control eyes, 100 PACS, and 100 PAC + PACG eyes. METHODS: Each participant underwent standardized ophthalmic examination and was classified by the presiding physician as either control, primary angle-closure suspect (PACS), primary angle closure (PAC), or primary angle-closure glaucoma (PACG). Deep Learning model was used to train 3 different CNN classifiers: classifier 1 aimed to separate control versus PACS versus PAC + PACG; classifier 2 aimed to separate control versus PACD; and classifier 3 aimed to separate PACS versus PAC + PACG. All classifiers were evaluated on independent validation sets from the same region, China and further tested using data from a different country, Singapore. MAIN OUTCOME MEASURES: Area under receiver operator characteristic curve (AUC), precision, and recall. RESULTS: Classifier 1 achieved an AUC of 0.96 on validation set from the same region, but dropped to an AUC of 0.84 on test set from a different country. Classifier 2 achieved the most generalizable performance with an AUC of 0.96 on validation set and AUC of 0.95 on test set. Classifier 3 showed the poorest performance, with an AUC of 0.83 and 0.64 on test and validation data sets, respectively. 
CONCLUSIONS: Convolutional neural network classifiers can effectively distinguish PACD from controls on AS-OCT with good generalizability across different patient cohorts. However, their performance is moderate when trying to distinguish PACS versus PAC + PACG. FINANCIAL DISCLOSURES: The authors have no proprietary or commercial interest in any materials discussed in this article.


Subject(s)
Deep Learning, Angle-Closure Glaucoma, Humans, Intraocular Pressure, Optical Coherence Tomography/methods, Cross-Sectional Studies, Angle-Closure Glaucoma/diagnosis

6.
AAPS PharmSciTech ; 24(8): 254, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38062329

ABSTRACT

Data variations, library changes, and poorly tuned hyperparameters can cause failures in data-driven modelling. In such scenarios, model drift, a gradual shift in model performance, can lead to inaccurate predictions. Monitoring and mitigating drift are vital to maintain model effectiveness. USFDA and ICH regulate pharmaceutical variation with scientific risk-based approaches. In this study, the hyperparameter optimization for the Artificial Neural Network Multilayer Perceptron (ANN-MLP) was investigated using open-source data. The design of experiments (DoE) approach in combination with target drift prediction and statistical process control (SPC) was employed to achieve this objective. First, pre-screening and optimization DoEs were conducted on lab-scale data, serving as internal validation data, to identify the design space and control space. The regression performance metrics were carefully monitored to ensure the right set of hyperparameters was selected, optimizing the modelling time and storage requirements. Before extending the analysis to external validation data, a drift analysis on the target variable was performed. This aimed to determine if the external data fell within the studied range or required retraining of the model. Although a drift was observed, the external data remained well within the range of the internal validation data. Subsequently, trend analysis and process monitoring for the mean absolute error of the active content were conducted. The combined use of DoE, drift analysis, and SPC enabled trend analysis, ensuring that both current and external validation data met acceptance criteria. Out-of-specification and process control limits were determined, providing valuable insights into the model's performance and overall reliability. 
This comprehensive approach allowed for robust hyperparameter optimization and effective management of model lifecycle, crucial in achieving accurate and dependable predictions in various real-world applications.
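
The process-monitoring step described above can be illustrated with a small sketch. The abstract does not say which control chart was used, so the Shewhart-style limits below (center line plus or minus three sample standard deviations of baseline mean absolute error values) are an assumption, as are the names `control_limits` and `out_of_control`.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from baseline MAE values.

    Assumption: limits are center +/- k sample standard deviations,
    with k = 3 as in a conventional Shewhart chart.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center, center + k * spread


def out_of_control(values, limits):
    """Indices of monitored values that fall outside the limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

Limits would be computed once on the internal validation runs and then applied to each new batch of external validation errors; any flagged index is a prompt to investigate drift or retrain.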


Subject(s)
Algorithms, Near-Infrared Spectroscopy, Reproducibility of Results, Computer Neural Networks, Machine Learning

7.
Polymers (Basel) ; 15(19)2023 Sep 30.
Article in English | MEDLINE | ID: mdl-37836011

ABSTRACT

This study aims to develop a high-generalizability machine learning framework for predicting the homogenized mechanical properties of short fiber-reinforced polymer composites. The ensemble machine learning (EML) model employs a stacking algorithm using three base models: Extra Trees (ET), eXtreme Gradient Boosting machine (XGBoost), and Light Gradient Boosting machine (LGBM). A micromechanical model based on a two-step homogenization algorithm is adopted and verified as an effective approach to modeling composites with randomly distributed fibers, and it is integrated with finite element simulations to provide a high-quality ground-truth dataset. The model performance is thoroughly assessed for its accuracy, efficiency, interpretability, and generalizability. The results suggest that: (1) the EML model outperforms the base members on prediction accuracy, achieving R² values of 0.988 and 0.952 on the train and test datasets, respectively; (2) the SHapley Additive exPlanations (SHAP) analysis identifies the Young's moduli of the matrix and the fiber, and the fiber content, as the top three factors influencing the homogenized properties, whereas the anisotropy is predominantly determined by the fiber orientations; (3) the EML model generalizes well to experimental data and is more effective than high-fidelity computational models, significantly lowering computational costs while maintaining high accuracy.

8.
Comput Biol Med ; 159: 106901, 2023 06.
Article in English | MEDLINE | ID: mdl-37068317

ABSTRACT

BACKGROUND AND PURPOSE: A medical AI system's generalizability describes the continuity of its performance acquired from varying geographic, historical, and methodologic settings. Previous literature on this topic has mostly focused on "how" to achieve high generalizability (e.g., via larger datasets, transfer learning, data augmentation, model regularization schemes), with limited success. Instead, we aim to understand "when" the generalizability is achieved: Our study presents a medical AI system that could estimate its generalizability status for unseen data on-the-fly. MATERIALS AND METHODS: We introduce a latent space mapping (LSM) approach utilizing Fréchet distance loss to force the underlying training data distribution into a multivariate normal distribution. During the deployment, a given test data's LSM distribution is processed to detect its deviation from the forced distribution; hence, the AI system could predict its generalizability status for any previously unseen data set. If low model generalizability is detected, then the user is informed by a warning message integrated into a sample deployment workflow. While the approach is applicable for most classification deep neural networks (DNNs), we demonstrate its application to a brain metastases (BM) detector for T1-weighted contrast-enhanced (T1c) 3D MRI. The BM detection model was trained using 175 T1c studies acquired internally (from the authors' institution) and tested using (1) 42 internally acquired exams and (2) 72 externally acquired exams from the publicly distributed Brain Mets dataset provided by the Stanford University School of Medicine. Generalizability scores, false positive (FP) rates, and sensitivities of the BM detector were computed for the test datasets. 
RESULTS AND CONCLUSION: The model predicted its generalizability to be low for 31% of the testing data (i.e., two of the internally and 33 of the externally acquired exams), where it produced (1) ∼13.5 false positives (FPs) at 76.1% BM detection sensitivity for the low and (2) ∼10.5 FPs at 89.2% BM detection sensitivity for the high generalizability groups respectively. These results suggest that the proposed formulation enables a model to predict its generalizability for unseen data.
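
To make the idea of measuring a deviation from the forced distribution concrete, the sketch below computes the squared Fréchet distance between two Gaussians and applies it to a batch of latent vectors. It is an illustration under strong assumptions, not the authors' formulation: covariances are treated as diagonal (so the matrix square root reduces to element-wise square roots), the target is taken to be a standard normal, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def frechet_distance_sq_diag(mu1, var1, mu2, var2):
    """Squared Frechet distance between two Gaussians with diagonal covariance.

    General form: |mu1 - mu2|^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)).
    For diagonal S1, S2 the trace term reduces to
    sum((sqrt(v1) - sqrt(v2))^2) over dimensions.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(var1, var2))
    return mean_term + cov_term


def latent_stats(latents):
    """Per-dimension mean and population variance of a batch of latent vectors."""
    columns = list(zip(*latents))
    return [mean(c) for c in columns], [pvariance(c) for c in columns]


def deviation_from_standard_normal(latents):
    """Distance of the batch statistics from N(0, I), the assumed target."""
    mu, var = latent_stats(latents)
    dim = len(mu)
    return frechet_distance_sq_diag(mu, var, [0.0] * dim, [1.0] * dim)
```

In a deployment workflow of the kind described, a large deviation for a new exam would trigger the low-generalizability warning; the threshold that separates "low" from "high" is not given in the abstract and would have to be chosen separately.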


Subject(s)
Brain Neoplasms, Computer-Assisted Diagnosis, Humans, Computer-Assisted Diagnosis/methods, Magnetic Resonance Imaging/methods, Computer Neural Networks, Brain Neoplasms/diagnostic imaging, Brain Neoplasms/secondary

9.
Front Physiol ; 14: 1084837, 2023.
Article in English | MEDLINE | ID: mdl-36744032

ABSTRACT

The photoplethysmography (PPG) signal is potentially suitable for atrial fibrillation (AF) detection because it is convenient to acquire and shares its physiological origin with the electrocardiogram (ECG). A few earlier studies have shown that the peak-to-peak interval of the PPG signal (PPIp) can be used for AF detection. However, an AF detector intended as a general model must be both accurate and generalizable, given individual differences in how even the same arrhythmia appears in PPG and the existence of sub-types. Moreover, a binary classifier for AF versus normal sinus rhythm is not convincing enough, because AF and ectopic beats look similar. In this study, we frame AF detection as a multi-class classification problem and propose a training pipeline that benefits both the accuracy and the generalizability of the classifier by determining the pipeline's configurable options: input format, deep learning model (with hyperparameter optimization), and transfer learning scheme. A rigorous comparison of the possible combinations of these components showed that the best combination is the first-order difference of the heartbeat sequence as the input format, a hybrid model of a 2-layer CNN and a 1-layer Transformer as the learning model, and whole-model fine-tuning as the transfer learning scheme (F1 value: 0.80, overall accuracy: 0.87).
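
The input format singled out above, the first-order difference of the heartbeat sequence, is simple to state in code. The sketch below assumes the sequence is a list of peak-to-peak intervals in milliseconds; the function name and the units are illustrative choices, not details taken from the paper.

```python
def first_order_difference(intervals):
    """Differences between consecutive peak-to-peak intervals.

    An input of n intervals yields n - 1 differences; a steady rhythm
    gives values near zero, while irregular rhythms give large swings.
    """
    return [later - earlier for earlier, later in zip(intervals, intervals[1:])]
```

Differencing removes the baseline heart rate, which varies from person to person, and keeps only beat-to-beat variability, one plausible reason it would help a classifier generalize across individuals.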

10.
BMC Med Inform Decis Mak ; 21(1): 224, 2021 07 24.
Article in English | MEDLINE | ID: mdl-34303356

ABSTRACT

BACKGROUND: Many models are published which predict outcomes in hospitalized COVID-19 patients. The generalizability of many is unknown. We evaluated the performance of selected models from the literature and our own models to predict outcomes in patients at our institution. METHODS: We searched the literature for models predicting outcomes in inpatients with COVID-19. We produced models of mortality or criticality (mortality or ICU admission) in a development cohort. We tested external models which provided sufficient information and our models using a test cohort of our most recent patients. The performance of models was compared using the area under the receiver operator curve (AUC). RESULTS: Our literature review yielded 41 papers. Of those, 8 were found to have sufficient documentation and concordance with features available in our cohort to implement in our test cohort. All models were from Chinese patients. One model predicted criticality and seven mortality. Tested against the test cohort, internal models had an AUC of 0.84 (0.74-0.94) for mortality and 0.83 (0.76-0.90) for criticality. The best external model had an AUC of 0.89 (0.82-0.96) using three variables, another an AUC of 0.84 (0.78-0.91) using ten variables. AUC's ranged from 0.68 to 0.89. On average, models tested were unable to produce predictions in 27% of patients due to missing lab data. CONCLUSION: Despite differences in pandemic timeline, race, and socio-cultural healthcare context some models derived in China performed well. For healthcare organizations considering implementation of an external model, concordance between the features used in the model and features available in their own patients may be important. Analysis of both local and external models should be done to help decide on what prediction method is used to provide clinical decision support to clinicians treating COVID-19 patients as well as what lab tests should be included in order sets.
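
Because this comparison rests entirely on the AUC, a minimal sketch of how it can be computed may help. The version below uses the pairwise (rank-based) definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the 0/1 label encoding are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the outcome (e.g. death), 0 otherwise.
    scores: model outputs, higher meaning higher predicted risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Patients for whom a model cannot produce a score (for example, because a required lab value is missing) simply drop out of this calculation, so an AUC should be read alongside the share of patients the model could actually score.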


Subject(s)
COVID-19, China, Hospitalization, Humans, Pandemics, Retrospective Studies, SARS-CoV-2

11.
JMIR Diabetes ; 6(2): e26909, 2021 Apr 29.
Article in English | MEDLINE | ID: mdl-33913816

ABSTRACT

BACKGROUND: Predictive alerts for impending hypoglycemic events enable persons with type 1 diabetes to take preventive actions and avoid serious consequences. OBJECTIVE: This study aimed to develop a prediction model for hypoglycemic events with a low false alert rate, high sensitivity and specificity, and good generalizability to new patients and time periods. METHODS: Performance improvement by focusing on sustained hypoglycemic events, defined as glucose values less than 70 mg/dL for at least 15 minutes, was explored. Two different modeling approaches were considered: (1) a classification-based method to directly predict sustained hypoglycemic events, and (2) a regression-based prediction of glucose at multiple time points in the prediction horizon and subsequent inference of sustained hypoglycemia. To address the generalizability and robustness of the model, two different validation mechanisms were considered: (1) patient-based validation (model performance was evaluated on new patients), and (2) time-based validation (model performance was evaluated on new time periods). RESULTS: This study utilized data from 110 patients over 30-90 days comprising 1.6 million continuous glucose monitoring values under normal living conditions. The model accurately predicted sustained events with >97% sensitivity and specificity for both 30- and 60-minute prediction horizons. The false alert rate was kept to <25%. The results were consistent across patient- and time-based validation strategies. CONCLUSIONS: Providing alerts focused on sustained events instead of all hypoglycemic events reduces the false alert rate and improves sensitivity and specificity. It also results in models that have better generalizability to new patients and time periods.
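
The definition of a sustained hypoglycemic event given above (glucose below 70 mg/dL for at least 15 minutes) can be sketched as a labeling function over continuous glucose monitoring readings. The input format of `(minute, glucose_mg_dl)` pairs, the function name, and the choice to measure duration from the first to the last below-threshold reading are assumptions; the abstract does not describe how events were extracted.

```python
def sustained_hypoglycemia(readings, threshold=70.0, min_minutes=15):
    """Return (start_minute, end_minute) for each sustained low-glucose run.

    readings: list of (minute, glucose_mg_dl) pairs sorted by time.
    A run is a maximal stretch of consecutive readings below `threshold`;
    it counts as sustained if it spans at least `min_minutes`.
    """
    episodes = []
    start = last = None
    for minute, glucose in readings:
        if glucose < threshold:
            if start is None:
                start = minute  # first low reading opens a run
            last = minute
        else:
            # A normal reading closes the current run, if any.
            if start is not None and last - start >= min_minutes:
                episodes.append((start, last))
            start = last = None
    # Close a run that is still open at the end of the data.
    if start is not None and last - start >= min_minutes:
        episodes.append((start, last))
    return episodes
```

Under this labeling a single low reading between normal ones is not an event, which matches the abstract's rationale that focusing on sustained events reduces false alerts.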

12.
Artif Intell Med ; 113: 102024, 2021 03.
Article in English | MEDLINE | ID: mdl-33685587

ABSTRACT

BACKGROUND AND OBJECTIVE: Clinical decision support assisted by prediction models usually faces the challenges of limited clinical data and a lack of labels when the model is developed with data from a single medical institution. Accordingly, research on multicenter clinical collaborative networks, which can provide external medical data, has received increasing attention. With the increasing availability of machine learning techniques such as transfer learning, leveraging large-scale patient data from multiple hospitals to build data-driven predictive models with clinical application potential provides an alternative solution to the problem of limited patient data. METHODS: A multicenter hybrid semi-supervised transfer learning model (MHSTL) is proposed in this study on the basis of a unified common data model that ensures a standardized representation of multicenter data. The hospital-specific features, along with the co-occurrence features across domains, are then aligned through a representation learning architecture built on deep neural networks and the newly proposed neural decision forest model. In this process, limited patient data from the target hospital, both labeled and unlabeled, are incorporated during feature adaptation, thereby contributing to better model performance. Without patient-level data sharing, the proposed model learning strategy, which overcomes feature misalignment and distribution divergence, enables multi-source transfer learning when patient data at the target hospital are insufficient and unlabeled. RESULTS: The effectiveness of the proposed transfer learning model was evaluated on a collaborative research network of colorectal cancer patients in the US and China. The results demonstrate that the proposed model can achieve much better performance than baseline models for predicting target risk with limited patient data. Better discrimination and calibration ability are also observed when sufficient labeled data are not available in the target hospital for prognosis prediction tasks. Further exploratory experiments show that the proposed approach exhibits good model generalizability regardless of the data heterogeneity. With the help of SHapley Additive exPlanations for model interpretation, the effectiveness of incorporating hospital-specific features in the transfer learning model is shown. CONCLUSIONS: The proposed method can develop prediction models from multiple source hospitals and exhibits good performance by leveraging cross-domain hospital-specific feature information, thereby enhancing model prediction when applied to a single medical institution with limited patient data.


Subject(s)
Clinical Decision Support Systems, Computer Neural Networks, Hospitals, Humans, Machine Learning, Supervised Machine Learning

13.
Ecosphere ; 8(10)2017 Oct.
Article in English | MEDLINE | ID: mdl-30237908

ABSTRACT

The use of models by ecologists and environmental managers, to inform environmental management and decision-making, has grown exponentially in the past 50 years. Due to logistical, economical, and theoretical benefits, model users frequently transfer preexisting models to new sites where data are scarce. Modelers have made significant progress in understanding how to improve model generalizability during model development. However, models are always imperfect representations of systems and are constrained by the contextual frameworks used during their development. Thus, model users need better ways to evaluate the possibility of unintentional misapplication when transferring models to new sites. We propose a method of describing a model's application niche for use during the model selection process. Using this method, model users synthesize information from databases, past studies, and/or past model transfers to create model performance curves and heat maps. We demonstrated this method using an empirical model developed to predict the ecological condition of plant communities in riverine wetlands of the Appalachian Highland physiographic region, U.S.A. We assessed this model's transferability and generalizability across (1) riverine wetlands in the contiguous U.S.A., (2) wetland types in the Appalachian Highland physiographic region, and (3) wetland types in the contiguous U.S.A. With this methodology and a discussion of its critical steps, we set the stage for further inquiries into the development of consistent and transparent practices for model selection when transferring a model.
