Results 1 - 20 of 269
1.
Heliyon ; 10(17): e36754, 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39286174

ABSTRACT

Corrosion is one of the key factors leading to material failure, which can occur in facilities and equipment closely related to people's lives, causing structural damage and thus affecting the safety of people's lives and property. To identify corrosion more effectively across multiple facilities and equipment, this paper utilizes a corrosion binary classification dataset containing various materials to develop a CNN classification model for better detection and distinction of material corrosion, using a methodological paradigm of transfer learning and fine-tuning. The proposed model implementation initially uses data augmentation to enhance the dataset and employs different sizes of EfficientNetV2 for training, evaluated using Confusion Matrix, ROC curve, and the values of Precision, Recall, and F1-score. To further enhance the testing results, this paper focuses on the impact of using the Global Average Pooling layer versus the Global Max Pooling layer, as well as the number of fine-tuning layers. The results show that the Global Average Pooling layer performs better, and that EfficientNetV2B0 with a fine-tuning rate of 20% and EfficientNetV2S with a fine-tuning rate of 15% achieve the highest testing accuracy of 0.9176, an ROC-AUC value of 0.97, and Precision, Recall, and F1-Score values exceeding 0.9. These findings can serve as a reference for other corrosion classification models that use EfficientNetV2.
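
The abstract compares a Global Average Pooling layer with a Global Max Pooling layer. As a minimal sketch of what the two operations compute (not the paper's implementation), the functions below collapse a channels x height x width feature map to one value per channel. The function names and the toy feature map are assumptions made for illustration.

```python
def global_average_pool(feature_map):
    """Collapse a C x H x W feature map to C values by averaging each channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def global_max_pool(feature_map):
    """Collapse a C x H x W feature map to C values by taking each channel's maximum."""
    return [max(max(row) for row in channel) for channel in feature_map]


# Hypothetical 2-channel, 2 x 2 feature map (illustrative values only).
feature_map = [
    [[0.0, 1.0], [2.0, 9.0]],  # one strong activation among weak ones
    [[4.0, 4.0], [4.0, 4.0]],  # uniform activation
]

gap = global_average_pool(feature_map)  # [3.0, 4.0]
gmp = global_max_pool(feature_map)      # [9.0, 4.0]
```

Averaging summarizes the whole spatial extent of a channel, while the maximum keeps only its single strongest response; which one suits a given dataset is an empirical question of the kind the abstract reports on.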

2.
Adv Sci (Weinh) ; : e2406535, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39234947

ABSTRACT

The extraction of lithium (Li) from complex brines presents significant challenges due to the interference of competing ions, particularly magnesium (Mg²⁺), which complicates the selective separation process. Herein, a strategy is introduced employing charge-lock enhanced 2D heterogeneous channels for the rapid and selective uptake of Li⁺. This approach integrates porous ZnFe₂O₄/ZnO nanosheets into Ag⁺-modulated sub-nanometer interlayer channels, forming channels optimized for Li⁺ extraction. The novelty lies in the charge-lock mechanism, which selectively captures Mg²⁺ ions, thereby facilitating the effective separation of Li from Mg. This mechanism is driven by a charge transfer during the formation of ZnFe₂O₄/ZnO, rendering O atoms in Fe-O bonds more negatively charged. These negative charges strongly interact with the high charge density of Mg²⁺ ions, enabling the charge-locking mechanism and the targeted capture of Mg²⁺. Optimization with Ag⁺ further improves interlayer spacing, increasing ion transport rates and addressing the swelling issue typical of 2D membranes. The resultant membrane showcases high water flux (44.37 L m⁻² h⁻¹ bar⁻¹) and an impressive 99.8% rejection of Mg²⁺ in real brine conditions, achieving a Li⁺/Mg²⁺ selectivity of 59.3, surpassing existing brine separation membranes. Additionally, this membrane demonstrates superior cyclic stability, highlighting its high potential for industrial applications.

3.
BMC Med Imaging ; 24(1): 230, 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39223507

ABSTRACT

Breast cancer is a leading cause of mortality among women globally, necessitating precise classification of breast ultrasound images for early diagnosis and treatment. Traditional methods using CNN architectures such as VGG, ResNet, and DenseNet, though somewhat effective, often struggle with class imbalances and subtle texture variations, leading to reduced accuracy for minority classes such as malignant tumors. To address these issues, we propose a methodology that leverages EfficientNet-B7, a scalable CNN architecture, combined with advanced data augmentation techniques to enhance minority class representation and improve model robustness. Our approach involves fine-tuning EfficientNet-B7 on the BUSI dataset, implementing RandomHorizontalFlip, RandomRotation, and ColorJitter to balance the dataset and improve model robustness. The training process includes early stopping to prevent overfitting and optimize performance metrics. Additionally, we integrate Explainable AI (XAI) techniques, such as Grad-CAM, to enhance the interpretability and transparency of the model's predictions, providing visual and quantitative insights into the features and regions of ultrasound images influencing classification outcomes. Our model achieves a classification accuracy of 99.14%, significantly outperforming existing CNN-based approaches in breast ultrasound image classification. The incorporation of XAI techniques enhances our understanding of the model's decision-making process, thereby increasing its reliability and facilitating clinical adoption. This comprehensive framework offers a robust and interpretable tool for the early detection and diagnosis of breast cancer, advancing the capabilities of automated diagnostic systems and supporting clinical decision-making processes.
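
The abstract states that training uses early stopping to prevent overfitting but gives no details. The following is a minimal, framework-independent sketch of the usual patience-based rule; the class name, the `patience` and `min_delta` parameters, and the validation losses are all hypothetical and are not taken from the paper.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.50]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch  # training would halt here
        break
```

With these illustrative losses the loop halts at epoch index 5, after three epochs without improvement over 0.65, so the later value 0.50 is never seen. That trade-off is why the patience value is usually treated as a tunable setting.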


Subject(s)
Breast Neoplasms , Ultrasonography, Mammary , Humans , Breast Neoplasms/diagnostic imaging , Female , Ultrasonography, Mammary/methods , Image Interpretation, Computer-Assisted/methods , Neural Networks, Computer , Artificial Intelligence
4.
Sci China Life Sci ; 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39158766

ABSTRACT

CRISPR-Cas12a genome engineering systems have been widely used in plant research and crop breeding. To date, the performance and use of anti-CRISPR-Cas12a systems have not been fully established in plants. Here, we conduct in silico analysis to identify putative anti-CRISPR systems for Cas12a. These putative anti-CRISPR proteins, along with known anti-CRISPR proteins, are assessed for their ability to inhibit Cas12a cleavage activity in vivo and in planta. Among all anti-CRISPR proteins tested, AcrVA1 shows robust inhibition of Mb2Cas12a and LbCas12a in E. coli. Further tests show that AcrVA1 inhibits LbCas12a-mediated genome editing in rice protoplasts and stable transgenic lines. Impressively, co-expression of AcrVA1 mitigates off-target effects by CRISPR-LbCas12a, as revealed by whole genome sequencing. In addition, transgenic plants expressing AcrVA1 exhibit different levels of inhibition to LbCas12a-mediated genome editing, representing a novel way of fine-tuning genome editing efficiency. By controlling temporal and spatial expression of AcrVA1, we show that inducible and tissue-specific genome editing can be achieved in plants. Furthermore, we demonstrate that AcrVA1 also inhibits LbCas12a-based CRISPR activation (CRISPRa) and based on this principle we build logic gates to turn on and off target genes in plant cells. Together, we have established an efficient anti-CRISPR-Cas12a system in plants and demonstrate its versatile applications in mitigating off-target effects, fine-tuning genome editing efficiency, achieving spatial-temporal control of genome editing, and generating synthetic logic gates for controlling target gene expression in plant cells.

5.
J Transl Med ; 22(1): 756, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39135093

ABSTRACT

BACKGROUND: Decoding human genomic sequences requires comprehensive analysis of DNA sequence functionality. Through computational and experimental approaches, researchers have studied the genotype-phenotype relationship and generated important datasets that help unravel complicated genetic blueprints. Thus, the recently developed artificial intelligence methods can be used to interpret the functions of those DNA sequences. METHODS: This study explores the use of deep learning, particularly pre-trained genomic models like DNA_bert_6 and human_gpt2-v1, in interpreting and representing human genome sequences. Initially, we meticulously constructed multiple datasets linking genotypes and phenotypes to fine-tune those models for precise DNA sequence classification. Additionally, we evaluate the influence of sequence length on classification results and analyze the impact of feature extraction in the hidden layers of our model using the HERV dataset. To enhance our understanding of phenotype-specific patterns recognized by the model, we perform enrichment, pathogenicity and conservation analyses of specific motifs in the human endogenous retrovirus (HERV) sequence with high average local representation weight (ALRW) scores. RESULTS: We have constructed multiple genotype-phenotype datasets displaying commendable classification performance in comparison with random genomic sequences, particularly in the HERV dataset, which achieved binary and multi-classification accuracies and F1 values exceeding 0.935 and 0.888, respectively. Notably, the fine-tuning of the HERV dataset not only improved our ability to identify and distinguish diverse information types within DNA sequences but also successfully identified specific motifs associated with neurological disorders and cancers in regions with high ALRW scores. Subsequent analysis of these motifs shed light on the adaptive responses of species to environmental pressures and their co-evolution with pathogens. CONCLUSIONS: These findings highlight the potential of pre-trained genomic models in learning DNA sequence representations, particularly when utilizing the HERV dataset, and provide valuable insights for future research endeavors. This study represents an innovative strategy that combines pre-trained genomic model representations with classical methods for analyzing the functionality of genome sequences, thereby promoting cross-fertilization between genomics and artificial intelligence.
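
Before a DNA sequence can be passed to a BERT-style genomic model it has to be split into tokens. The abstract does not describe the tokenizer; the sketch below assumes, based only on the model name DNA_bert_6, that sequences are split into overlapping 6-mers. The function name `kmer_tokens` and the example sequence are hypothetical.

```python
def kmer_tokens(sequence, k=6):
    """Split a DNA sequence into overlapping k-mers with stride 1.

    Assumption: overlapping k-mer tokenization; the abstract does not
    specify how the models tokenize their input.
    """
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical 8-base sequence, which yields 8 - 6 + 1 = 3 tokens.
tokens = kmer_tokens("atgcgtac")
# tokens == ["ATGCGT", "TGCGTA", "GCGTAC"]
```

A sequence of length n yields n - k + 1 tokens under this scheme, which is one reason sequence length interacts with a model's maximum input size.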


Subject(s)
Genome, Human , Genomics , Phenotype , Humans , Genomics/methods , Models, Genetic , Endogenous Retroviruses/genetics , Deep Learning , Genotype
6.
JAMIA Open ; 7(3): ooae075, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39139700

ABSTRACT

Objectives: Clinical note section identification helps locate relevant information and could be beneficial for downstream tasks such as named entity recognition. However, the traditional supervised methods suffer from transferability issues. This study proposes a new framework for using large language models (LLMs) for section identification to overcome the limitations. Materials and Methods: We framed section identification as question-answering and provided the section definitions in free-text. We evaluated multiple LLMs off-the-shelf without any training. We also fine-tuned our LLMs to investigate how the size and the specificity of the fine-tuning dataset impact model performance. Results: GPT4 achieved the highest F1 score of 0.77. The best open-source model (Tulu2-70b) achieved 0.64 and is on par with GPT3.5 (ChatGPT). GPT4 is also found to obtain F1 scores greater than 0.9 for 9 out of the 27 (33%) section types and greater than 0.8 for 15 out of 27 (56%) section types. For our fine-tuned models, we found they plateaued with an increasing size of the general domain dataset. We also found that adding a reasonable amount of section identification examples is beneficial. Discussion: These results indicate that GPT4 is nearly production-ready for section identification, and seemingly contains both knowledge of note structure and the ability to follow complex instructions, and the best current open-source LLM is catching up. Conclusion: Our study shows that LLMs are promising for generalizable clinical note section identification. They have the potential to be further improved by adding section identification examples to the fine-tuning dataset.
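
The abstract frames section identification as question-answering with free-text section definitions. One way such a prompt might be assembled is sketched below. The prompt wording, the function name `build_section_prompt`, and the two section definitions are hypothetical; the study's actual prompts and its 27 section types are not given in the abstract.

```python
def build_section_prompt(note_segment, section_definitions):
    """Assemble a question-answering prompt from free-text section definitions."""
    lines = [
        "Which section type best describes the clinical note segment below?",
        "",
        "Section definitions:",
    ]
    for name, definition in section_definitions.items():
        lines.append(f"- {name}: {definition}")
    lines += [
        "",
        "Note segment:",
        note_segment.strip(),
        "",
        "Answer with exactly one section name.",
    ]
    return "\n".join(lines)


# Hypothetical definitions, for illustration only.
definitions = {
    "Medications": "Drugs the patient is taking, with dose and frequency.",
    "Family History": "Health conditions of the patient's relatives.",
}

prompt = build_section_prompt("  Metformin 500 mg twice daily.  ", definitions)
```

Because the definitions are passed in as data, the same function can be reused when the set of section types changes, which is in the spirit of the transferability concern the abstract raises.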

7.
Stud Health Technol Inform ; 316: 378-382, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176757

ABSTRACT

Systematic review and meta-analysis constitute a staple of evidence-based medicine, an obligatory step in developing guideline and recommendation documents. It is a formalized process aiming at extracting and summarizing knowledge from the published work, grading, and considering the quality of the included studies. It is very laborious and time-consuming. Therefore, meta-analyses are rarely updated and seldom living, decreasing their utility with time. Here, we present a framework for integrating large language models and natural language processing techniques, applied to a previously published systematic review and meta-analysis of the diagnostic test accuracy of point-of-care tests. We show that the framework can be used to automate the screening step of the existing meta-analyses with minimal costs to quality and, to a large extent, the extraction step while maintaining the strict nature of the systematic review process.


Subject(s)
Meta-Analysis as Topic , Natural Language Processing , Systematic Reviews as Topic , Humans , Evidence-Based Medicine
8.
Stud Health Technol Inform ; 316: 650-651, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176825

ABSTRACT

This study introduces a novel approach for generating machine-generated instruction datasets for fine-tuning medical-specialized language models using MIMIC-IV discharge records. The study created a large-scale text dataset comprising instructions, cropped discharge notes as inputs, and outputs in JSONL format. The dataset was generated through three main stages: instruction generation and output generation, both using seed tasks provided by medical experts, followed by invalid data filtering. The generated dataset consisted of 51,385 sets, with mean ROUGE between seed tasks of 0.185. Evaluation of the generated dataset was promising, with high validity rates determined by both GPT-3.5 and a human annotator (88.0% and 88.5% respectively). The study highlights the potential of automating dataset creation for NLP tasks in the medical domain.
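
The abstract describes records made of an instruction, a cropped discharge note as input, and an output, stored in JSONL format and passed through an invalid-data filter. A minimal sketch of that record shape follows. The field names `instruction`, `input`, and `output`, the validity rule, and the example text are assumptions, since the abstract does not give the schema or the filtering criteria.

```python
import json


def is_valid(record):
    """Hypothetical validity rule: all three fields are non-empty strings."""
    return all(
        isinstance(record.get(key), str) and record[key].strip()
        for key in ("instruction", "input", "output")
    )


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


# Invented examples; no real patient data.
records = [
    {
        "instruction": "List the discharge medications.",
        "input": "Discharged on aspirin 81 mg daily.",
        "output": "Aspirin 81 mg daily.",
    },
    {
        "instruction": "Summarize the hospital course.",
        "input": "Admitted for observation.",
        "output": "",  # empty output, so the filter drops this record
    },
]

valid_records = [record for record in records if is_valid(record)]
jsonl_text = to_jsonl(valid_records)
```

JSONL keeps each example on its own line, so a large dataset can be streamed or filtered line by line without loading the whole file.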


Subject(s)
Electronic Health Records , Natural Language Processing , Humans , Patient Discharge , Patient Discharge Summaries
9.
Sci Rep ; 14(1): 19425, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39169054

ABSTRACT

This paper introduces the efficient medical-images-aimed segment anything model (EMedSAM), addressing the high computational demands and limited adaptability of using SAM for medical image segmentation tasks. We present a novel, compact image encoder, DD-TinyViT, designed to enhance segmentation efficiency through an innovative parameter tuning method called med-adapter. The lightweight DD-TinyViT encoder is derived from the well-known ViT-H using a decoupled distillation approach. The segmentation and recognition capabilities of EMedSAM for specific structures are improved by med-adapter, which dynamically adjusts the model parameters specifically for medical imaging. We conducted extensive testing on EMedSAM using the public FLARE 2022 dataset and datasets from the First Hospital of Zhejiang University School of Medicine. The results demonstrate that our model outperforms existing state-of-the-art models in both multi-organ and lung segmentation tasks.


Subject(s)
Image Processing, Computer-Assisted , Humans , Image Processing, Computer-Assisted/methods , Algorithms , Lung/diagnostic imaging , Diagnostic Imaging/methods , Tomography, X-Ray Computed/methods , Models, Theoretical
10.
J Chem Inf Model ; 64(16): 6259-6280, 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39136669

ABSTRACT

Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.


Subject(s)
Machine Learning , Drug Discovery/methods , Deep Learning
11.
Sci Rep ; 14(1): 19534, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39174564

ABSTRACT

Optimizers are the bottleneck of the training process of any convolutional neural network (CNN) model. One of the critical steps when working on a CNN model is choosing the optimal optimizer for a specific problem. A recent research challenge is building new versions of traditional CNN optimizers that work more efficiently than the originals. Therefore, this work proposes a novel enhanced version of the Adagrad optimizer, called SAdagrad, that avoids the drawbacks of Adagrad in tuning the learning rate value at each step of the training process. To evaluate SAdagrad, this paper builds a CNN model that combines a fine-tuning technique with a weight decay technique. It trains the proposed CNN model on the Kather colorectal cancer histology dataset, which is one of the most challenging datasets in recent research on the diagnosis of colorectal cancer (CRC). Recently, plenty of deep learning models have achieved successful results in CRC classification experiments. However, the enhancement of these models remains challenging. To train our proposed model, a transfer learning process, adopted from a complicated pre-defined model, is applied to the proposed model and combined with a regularization technique that helps avoid overfitting. The experimental results show that SAdagrad reaches a remarkable accuracy (98%) when compared with the Adaptive Moment Estimation (Adam) optimizer and the Adagrad optimizer. The experiments also reveal that the proposed model has more stable training and testing processes, can reduce overfitting over multiple epochs, and can achieve higher accuracy than previous research on CRC diagnosis using the same Kather colorectal cancer histology dataset.
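
For context on the baseline being modified, the sketch below implements the standard Adagrad update, in which each parameter's step is divided by the square root of its accumulated squared gradients. It is not SAdagrad, whose changes the abstract does not specify. The function name, the learning rate, and the toy objective f(x) = x^2 are assumptions made for illustration.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One standard Adagrad update; returns new parameters and accumulators."""
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g  # running sum of squared gradients for this parameter
        new_accum.append(a)
        # The effective learning rate shrinks as the accumulator grows.
        new_params.append(p - lr * g / (math.sqrt(a) + eps))
    return new_params, new_accum


# Toy problem: minimize f(x) = x^2, whose gradient is 2x.
params, accum = [1.0], [0.0]
for _ in range(50):
    grads = [2.0 * params[0]]
    params, accum = adagrad_step(params, grads, accum)
```

Because the accumulator never decreases, the effective learning rate can only shrink over training, which is the commonly cited drawback of Adagrad and plausibly the learning-rate behaviour the abstract refers to.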


Subject(s)
Colorectal Neoplasms , Deep Learning , Neural Networks, Computer , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/pathology , Humans , Algorithms
12.
Toxicology ; 508: 153933, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39181527

ABSTRACT

To underpin scientific evaluations of chemical risks, agencies such as the European Food Safety Authority (EFSA) heavily rely on the outcome of systematic reviews, which currently require extensive manual effort. One specific challenge constitutes the meaningful use of vast amounts of valuable data from new approach methodologies (NAMs) which are mostly reported in an unstructured way in the scientific literature. In the EFSA-initiated project 'AI4NAMS', the potential of large language models (LLMs) was explored. Models from the GPT family, where GPT refers to Generative Pre-trained Transformer, were used for searching, extracting, and integrating data from scientific publications for NAM-based risk assessment. A case study on bisphenol A (BPA), a substance of very high concern due to its adverse effects on human health, focused on the structured extraction of information on test systems measuring biologic activities of BPA. Fine-tuning of a GPT-3 model (Curie base model) for extraction tasks was tested and the performance of the fine-tuned model was compared to the performance of a ready-to-use model (text-davinci-002). To update findings from the AI4NAMS project and to check for technical progress, the fine-tuning exercise was repeated and a newer ready-to-use model (text-davinci-003) served as comparison. In both cases, the fine-tuned Curie model was found to be superior to the ready-to-use model. Performance improvement was also obvious between text-davinci-002 and the newer text-davinci-003. Our findings demonstrate how fine-tuning and the swift general technical development improve model performance and contribute to the growing number of investigations on the use of AI in scientific and regulatory tasks.


Subject(s)
Artificial Intelligence , Benzhydryl Compounds , Phenols , Risk Assessment/methods , Benzhydryl Compounds/toxicity , Humans , Phenols/toxicity , Data Mining/methods
13.
Med Image Anal ; 98: 103324, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39213939

ABSTRACT

Although the segment anything model (SAM) has achieved impressive results on general-purpose semantic segmentation, with strong generalization ability on everyday images, its demonstrated performance on medical image segmentation is less precise and less stable, especially when dealing with tumor segmentation tasks that involve objects of small sizes, irregular shapes, and low contrast. Notably, the original SAM architecture is designed for 2D natural images and therefore would not be able to extract the 3D spatial information from volumetric medical data effectively. In this paper, we propose a novel adaptation method for transferring SAM from 2D to 3D for promptable medical image segmentation. Through a holistically designed scheme for architecture modification, we transfer the SAM to support volumetric inputs while retaining the majority of its pre-trained parameters for reuse. The fine-tuning process is conducted in a parameter-efficient manner, wherein most of the pre-trained parameters remain frozen, and only a few lightweight spatial adapters are introduced and tuned. Regardless of the domain gap between natural and medical data and the disparity in the spatial arrangement between 2D and 3D, the transformer trained on natural images can effectively capture the spatial patterns present in volumetric medical images with only lightweight adaptations. We conduct experiments on four open-source tumor segmentation datasets, and with a single click prompt, our model can outperform domain state-of-the-art medical image segmentation models and interactive segmentation models. We also compared our adaptation method with existing popular adapters and observed significant performance improvement on most datasets. Our code and models are available at: https://github.com/med-air/3DSAM-adapter.
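
To give a sense of why adapter-based fine-tuning counts as parameter-efficient, the sketch below tallies the trainable parameters of a generic bottleneck adapter (a down-projection followed by an up-projection) against a frozen backbone. This is a generic adapter, not the spatial adapter proposed in the paper, and the hidden size, bottleneck size, adapter count, and backbone size are hypothetical numbers chosen for illustration.

```python
def linear_params(d_in, d_out):
    """Weights plus biases of one fully connected layer."""
    return d_in * d_out + d_out


def bottleneck_adapter_params(hidden_dim, bottleneck_dim):
    """Down-projection to the bottleneck, then up-projection back to hidden_dim."""
    return linear_params(hidden_dim, bottleneck_dim) + linear_params(bottleneck_dim, hidden_dim)


def trainable_fraction(frozen_params, num_adapters, hidden_dim, bottleneck_dim):
    """Share of all parameters that are updated when only adapters are tuned."""
    trainable = num_adapters * bottleneck_adapter_params(hidden_dim, bottleneck_dim)
    return trainable / (frozen_params + trainable)


# Hypothetical configuration: 12 adapters on a 90M-parameter frozen backbone.
per_adapter = bottleneck_adapter_params(hidden_dim=768, bottleneck_dim=64)  # 99136
fraction = trainable_fraction(90_000_000, 12, 768, 64)
```

Under these assumed sizes, roughly 1.3% of the parameters are trained, which illustrates the "most parameters frozen, a few lightweight adapters tuned" arrangement the abstract describes.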


Subject(s)
Imaging, Three-Dimensional , Humans , Imaging, Three-Dimensional/methods , Algorithms , Neoplasms/diagnostic imaging
14.
Plant Methods ; 20(1): 124, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138512

ABSTRACT

BACKGROUND: Chinese Cymbidium orchids, cherished for their deep-rooted cultural significance and significant economic value in China, have spawned a rich tapestry of cultivars. However, these orchid cultivars are facing challenges from insufficient cultivation practices and antiquated techniques, including cultivar misclassification, complex identification, and the proliferation of counterfeit products. Current commercial techniques and academic research primarily emphasize species identification of orchids, rather than delving into that of orchid cultivars within species. RESULTS: To bridge this gap, the authors dedicated over a year to collecting a cultivar image dataset for Chinese Cymbidium orchids named Orchid2024. This dataset contains over 150,000 images spanning 1,275 different categories, involving visits to 20 cities across 12 provincial administrative regions in China to gather pertinent data. Subsequently, we introduced various visual parameter-efficient fine-tuning (PEFT) methods to expedite model development, achieving the highest top-1 accuracy of 86.14% and top-5 accuracy of 95.44%. CONCLUSION: Experimental results demonstrate the complexity of the dataset while highlighting the considerable promise of PEFT methods within flower image classification. We believe that our work not only provides a practical tool for orchid researchers, growers and market participants, but also provides a unique and valuable resource for further exploring fine-grained image classification tasks. The dataset and code are available at https://github.com/pengyingshu/Orchid2024.
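
The abstract reports top-1 and top-5 accuracy. As a reminder of how these metrics are defined, the sketch below counts a prediction as correct when the true class is among the k highest-scoring classes. The function name and the toy scores are assumptions, with three classes instead of the dataset's 1,275.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical per-class scores for four images and three classes.
scores = [
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
]
labels = [1, 1, 0, 2]

top1 = top_k_accuracy(scores, labels, k=1)  # 0.25
top2 = top_k_accuracy(scores, labels, k=2)  # 0.5
```

Top-k accuracy can never decrease as k grows. A large gap between top-1 and top-5, as in the abstract, is typical of fine-grained tasks where the correct class is often ranked just below a visually similar one.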

15.
Front Cell Dev Biol ; 12: 1457209, 2024.
Article in English | MEDLINE | ID: mdl-39170919

ABSTRACT

Biological membranes consist of a lipid bilayer in which integral membrane proteins are embedded. Based on the compositional complexity of the lipid species found in membranes, and on their specific and selective interactions with membrane proteins, we recently suggested that membrane bilayers can be best described as "finely-tuned molecular machines." We now discuss one such set of lipid-protein interactions by describing a negative feedback mechanism operating in the de novo sphingolipid biosynthetic pathway, which occurs in the membrane of the endoplasmic reticulum, and describe the atomic interactions between the first enzyme in the pathway, namely serine palmitoyl transferase, and the product of the fourth enzyme in the pathway, ceramide. We explore how hydrogen-bonding and hydrophobic interactions formed between Asn13 and Phe63 in the serine palmitoyl transferase complex and ceramide can influence the ceramide content of the endoplasmic reticulum. This example of finely-tuned biochemical interactions raises intriguing mechanistic questions about how sphingolipids and their biosynthetic enzymes could have evolved, particularly in light of their metabolic co-dependence.

16.
Ann Occup Environ Med ; 36: e19, 2024.
Article in English | MEDLINE | ID: mdl-39188666

ABSTRACT

Background: Accurate occupation classification is essential in various fields, including policy development and epidemiological studies. This study aims to develop an occupation classification model based on DistilKoBERT. Methods: This study used data from the 5th and 6th Korean Working Conditions Surveys conducted in 2017 and 2020, respectively. A total of 99,665 survey participants, who were nationally representative of Korean workers, were included. We used natural language responses regarding their job responsibilities and occupational codes based on the Korean Standard Classification of Occupations (7th version, 3-digit codes). The dataset was randomly split into training and test datasets in a ratio of 7:3. The occupation classification model based on DistilKoBERT was fine-tuned using the training dataset, and the model was evaluated using the test dataset. The accuracy, precision, recall, and F1 score were calculated as evaluation metrics. Results: The final model, which classified 28,996 survey participants in the test dataset into 142 occupational codes, exhibited an accuracy of 84.44%. For the evaluation metrics, the precision, recall, and F1 score of the model, calculated by weighting based on the sample size, were 0.83, 0.84, and 0.83, respectively. The model demonstrated high precision in the classification of service and sales workers yet exhibited low precision in the classification of managers. In addition, it displayed high precision in classifying occupations prominently represented in the training dataset. Conclusions: This study developed an occupation classification system based on DistilKoBERT, which demonstrated reasonable performance. Although further efforts are needed to enhance the classification accuracy, this automated occupation classification model holds promise for advancing epidemiological studies in the fields of occupational safety and health.
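
The abstract reports precision, recall, and F1 "calculated by weighting based on the sample size". One common reading of that phrase is support-weighted averaging, where each class's metric is weighted by its share of the true labels. The sketch below implements that reading; whether it matches the authors' exact computation is an assumption, and the labels are invented.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall, and F1 over all classes in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        p_cls = tp / predicted if predicted else 0.0
        r_cls = tp / n
        f_cls = 2 * p_cls * r_cls / (p_cls + r_cls) if (p_cls + r_cls) else 0.0
        weight = n / total  # this class's share of the true labels
        precision += weight * p_cls
        recall += weight * r_cls
        f1 += weight * f_cls
    return precision, recall, f1


# Invented labels: three sales workers and one manager.
y_true = ["sales", "sales", "sales", "manager"]
y_pred = ["sales", "sales", "manager", "manager"]

precision, recall, f1 = weighted_prf(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Under this weighting, recall always equals overall accuracy, which is consistent with the abstract's reported accuracy of 84.44% and weighted recall of 0.84. Large classes dominate the averages, so a weak minority class such as managers moves them very little.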

17.
Diagnostics (Basel) ; 14(16), 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39202202

ABSTRACT

Brain tumors are a leading cause of death globally, with numerous types varying in malignancy, and only 12% of adults diagnosed with brain cancer survive beyond five years. This research introduces a hyperparametric convolutional neural network (CNN) model to identify brain tumors, with significant practical implications. By fine-tuning the hyperparameters of the CNN model, we optimize feature extraction and systematically reduce model complexity, thereby enhancing the accuracy of brain tumor diagnosis. The critical hyperparameters include batch size, layer counts, learning rate, activation functions, pooling strategies, padding, and filter size. The hyperparameter-tuned CNN model was trained on three different brain MRI datasets available at Kaggle, producing outstanding performance scores, with an average value of 97% for accuracy, precision, recall, and F1-score. Our optimized model is effective, as demonstrated by our methodical comparisons with state-of-the-art approaches. Our hyperparameter modifications enhanced the model performance and strengthened its capacity for generalization, giving medical practitioners a more accurate and effective tool for making crucial judgments regarding brain tumor diagnosis. Our model is a significant step in the right direction toward trustworthy and accurate medical diagnosis, with practical implications for improving patient outcomes.
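
The abstract lists the hyperparameters that were tuned but not the search procedure. As one simple possibility, the sketch below enumerates every combination in a small grid and keeps the configuration with the best score. The search space values and the `toy_score` function are hypothetical stand-ins; in practice the score would come from training and validating the CNN with each configuration.

```python
from itertools import product

# Hypothetical search space covering three of the hyperparameters named above.
search_space = {
    "batch_size": [16, 32],
    "learning_rate": [1e-3, 1e-4],
    "pooling": ["max", "average"],
}


def iter_configs(space):
    """Yield every combination of hyperparameter values as a dict."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def grid_search(space, evaluate):
    """Return the configuration with the highest score from `evaluate`."""
    return max(iter_configs(space), key=evaluate)


def toy_score(config):
    """Stand-in for validation accuracy; a real run would train a model here."""
    return -abs(config["batch_size"] - 32) - config["learning_rate"]


best = grid_search(search_space, toy_score)
```

The number of configurations is the product of the list lengths (8 here), so exhaustive grids grow quickly as more hyperparameters such as layer counts or filter sizes are added.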

18.
Neural Netw ; 180: 106663, 2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39208459

ABSTRACT

Utilizing large-scale pretrained models is a well-known strategy to enhance performance on various target tasks. It is typically achieved through fine-tuning pretrained models on target tasks. However, naive fine-tuning may not fully leverage knowledge embedded in pretrained models. In this study, we introduce a novel fine-tuning method, called stochastic cross-attention (StochCA), specific to Transformer architectures. This method modifies the Transformer's self-attention mechanism to selectively utilize knowledge from pretrained models during fine-tuning. Specifically, in each block, instead of self-attention, cross-attention is performed stochastically according to the predefined probability, where keys and values are extracted from the corresponding block of a pretrained model. By doing so, queries and channel-mixing multi-layer perceptron layers of a target model are fine-tuned to target tasks to learn how to effectively exploit rich representations of pretrained models. To verify the effectiveness of StochCA, extensive experiments are conducted on benchmarks in the areas of transfer learning and domain generalization, where the exploitation of pretrained models is critical. Our experimental results show the superiority of StochCA over state-of-the-art approaches in both areas. Furthermore, we demonstrate that StochCA is complementary to existing approaches, i.e., it can be combined with them to further improve performance. We release the code at https://github.com/daintlab/stochastic_cross_attention.
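
The core idea in the abstract, choosing per block between self-attention and cross-attention to a pretrained model with a predefined probability, can be sketched in a few lines. This is a simplified illustration and not the released implementation: it uses a single attention head, omits the learned query, key, and value projections, and all names and values (`stochastic_attention`, `p_cross`, the toy token vectors) are hypothetical.

```python
import math
import random


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


def stochastic_attention(target_tokens, pretrained_tokens, p_cross, rng):
    """With probability p_cross, take keys and values from the pretrained features.

    Queries always come from the target model; otherwise this is plain self-attention.
    """
    use_cross = rng.random() < p_cross
    source = pretrained_tokens if use_cross else target_tokens
    return attention(target_tokens, source, source), use_cross


rng = random.Random(0)
target = [[1.0, 0.0], [0.0, 1.0]]      # features from the model being fine-tuned
pretrained = [[0.5, 0.5], [0.2, 0.8]]  # features from the frozen pretrained model
output, used_cross = stochastic_attention(target, pretrained, p_cross=0.5, rng=rng)
```

Setting `p_cross` to 0 recovers ordinary self-attention and setting it to 1 always attends to the pretrained features; intermediate values mix the two behaviours across training steps.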

19.
Med Image Anal ; 97: 103258, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38996667

ABSTRACT

Foundation models pre-trained on large-scale data have achieved widely recognized success in various downstream natural imaging tasks. Parameter-efficient fine-tuning (PEFT) methods aim to adapt foundation models to new domains by updating only a small portion of parameters in order to reduce computational overhead. However, the effectiveness of these PEFT methods, especially in cross-domain few-shot scenarios, e.g., medical image analysis, has not been fully explored. In this work, we facilitate the study of the performance of PEFT when adapting foundation models to medical image classification tasks. Furthermore, to alleviate the limitations of mainstream prompt tuning methods on Transformer architectures, both in how prompts are introduced and in their approximation capabilities, we propose the Embedded Prompt Tuning (EPT) method by embedding prompt tokens into the expanded channels. We also find that there are anomalies in the feature space distribution of foundation models during the pre-training process, and prompt tuning can help mitigate this negative impact. To explain this phenomenon, we also introduce a novel perspective to understand prompt tuning: prompt tuning is a distribution calibrator. We support this view by analysing the patch-wise scaling and feature separation operations contained in EPT. Our experiments show that EPT outperforms several state-of-the-art fine-tuning methods by a significant margin on few-shot medical image classification tasks, and completes the fine-tuning process within highly competitive time, indicating EPT is an effective PEFT method. The source code is available at github.com/zuwenqiang/EPT.


Subject(s)
Algorithms , Humans , Calibration , Image Processing, Computer-Assisted/methods
20.
Biomed Phys Eng Express ; 10(5), 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39029475

ABSTRACT

Background. Glioblastoma Multiforme (GBM) is an aggressive form of malignant brain tumor with a generally poor prognosis. O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation has been shown to be a predictive bio-marker for resistance to treatment of GBM, but it is invasive and time-consuming to determine methylation status. There has been effort to predict the MGMT methylation status through analyzing MRI scans using machine learning, which only requires pre-operative scans that are already part of standard-of-care for GBM patients. Purpose. To improve the performance of conventional transfer learning in the identification of MGMT promoter methylation status, we developed a 3D SpotTune network with adaptive fine-tuning capability. Using the pretrained weights of MedicalNet with the SpotTune network, we compared its performance with a randomly initialized network for different combinations of MR modalities. Methods. Using a ResNet50 as the base network, three categories of networks are created: (1) A 3D SpotTune network to process volumetric MR images, (2) a network with randomly initialized weights, and (3) a network pre-trained on MedicalNet. These three networks are trained and evaluated using a public GBM dataset provided by the University of Pennsylvania. The MRI scans from 240 patients are used, with 11 different modalities corresponding to a set of perfusion, diffusion, and structural scans. The performance is evaluated using 5-fold cross validation with a hold-out testing dataset. Results. The SpotTune network showed better performance than the randomly initialized network. The best performing SpotTune model achieved an area under the Receiver Operating Characteristic curve (AUC), average precision of the precision-recall curve (AP), sensitivity, and specificity values of 0.6604, 0.6179, 0.6667, and 0.6061 respectively. Conclusions. SpotTune enables transfer learning to be adaptive to individual patients, resulting in improved performance in predicting MGMT promoter methylation status in GBM using equivalent MRI modalities as compared to a randomly initialized network.
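
The evaluation protocol mentioned in the abstract, 5-fold cross validation alongside a hold-out test set, can be sketched as follows. The abstract does not give the size of the hold-out set or the splitting procedure, so the development-set size of 200, the random seed, and the function name are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical development set of 200 patients; the hold-out test patients are
# assumed to have been set aside beforehand and never enter these folds.
splits = list(k_fold_indices(200, k=5))
```

Each patient appears in exactly one validation fold, so every development sample is used for both training and validation while the hold-out set stays untouched until the final evaluation.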


Subject(s)
Brain Neoplasms , DNA Methylation , DNA Modification Methylases , DNA Repair Enzymes , Glioblastoma , Magnetic Resonance Imaging , Promoter Regions, Genetic , Tumor Suppressor Proteins , Humans , Glioblastoma/genetics , Glioblastoma/diagnostic imaging , Magnetic Resonance Imaging/methods , Brain Neoplasms/genetics , Brain Neoplasms/diagnostic imaging , DNA Modification Methylases/genetics , DNA Modification Methylases/metabolism , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism , DNA Repair Enzymes/genetics , DNA Repair Enzymes/metabolism , Machine Learning , ROC Curve , Male , Female , Neural Networks, Computer , Adult , Algorithms