1.
Neural Netw ; 180: 106680, 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39243513

ABSTRACT

Most existing log-driven anomaly detection methods assume that logs are static and unchanged, which is often impractical. To address this, we propose a log anomaly detection model called DualAttlog. This model includes word-level and sequence-level semantic encoding modules, as well as a context-aware dual attention module. Specifically, the word-level semantic encoding module uses a self-matching attention mechanism to explore the interactions between words in log sequences. By performing word embedding and semantic encoding, it captures the associations and evolution between words, extracting local semantic information. The sequence-level semantic encoding module encodes the entire log sequence with a pre-trained model, extracting global semantic information and capturing the overall patterns and trends in the logs. The context-aware dual attention module integrates these two levels of encoding, using contextual information to reduce redundancy and enhance detection accuracy. Experimental results show that DualAttlog achieves an F1-score of over 95% on 7 public datasets. Impressively, it achieves an F1-score of 82.35% on the Real-Industrial W dataset and 83.54% on the Real-Industrial Q dataset, outperforming existing baseline techniques on 9 datasets and demonstrating its clear advantages.
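
As a rough illustration of the self-matching (self-attention) step over word embeddings described in this abstract, the following PyTorch sketch computes word-word interaction scores within a log sequence. The class name, layer sizes, and scaling are illustrative assumptions, not the authors' implementation.

# Minimal sketch of self-matching attention over word embeddings in a log
# sequence; all names and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfMatchingAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (batch, seq_len, dim) word embeddings
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = q @ k.transpose(-2, -1) / x.size(-1) ** 0.5   # word-word interactions
        attn = F.softmax(scores, dim=-1)
        return attn @ v                        # context-enriched word representations

# Usage: 4 log sequences, 32 tokens each, embedded in 128 dimensions.
emb = torch.randn(4, 32, 128)
out = SelfMatchingAttention(128)(emb)          # shape (4, 32, 128)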

2.
J Biomed Inform ; 157: 104715, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39197731

ABSTRACT

Accurately predicting blood glucose levels is crucial in diabetes management to mitigate patients' risk of complications. However, blood glucose values exhibit instability, and existing prediction methods often struggle to capture their volatile nature, leading to inaccurate trend forecasts. To address these challenges, we propose a novel blood glucose level prediction model based on the Informer architecture: BGformer. Our model introduces a feature enhancement module and a microscale overlapping concerns mechanism. The feature enhancement module integrates periodic and trend feature extractors, enhancing the model's ability to capture relevant information from the data. By extending the feature extraction capacity of time series data, it provides richer feature representations for analysis. Meanwhile, the microscale overlapping concerns mechanism adopts a window-based strategy, computing attention scores only within specific windows. This approach reduces computational complexity while enhancing the model's capacity to capture local temporal dependencies. Furthermore, we introduce a dual attention enhancement module to augment the model's expressive capability. Through prediction experiments on blood glucose values from sixteen diabetic patients, our model outperformed eight benchmark models in terms of both MAE and RMSE metrics for future 60-minute and 90-minute predictions. Our proposed scheme significantly improves the model's dependency-capturing ability, resulting in more accurate blood glucose level predictions.
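
The abstract describes computing attention scores only within specific windows to cut cost while keeping local temporal dependencies. Below is a generic non-overlapping windowed self-attention sketch in PyTorch; the window size, feature dimensions, and the absence of overlap are simplifying assumptions rather than the paper's exact mechanism.

# Window-restricted self-attention over a time series: scores are computed only
# inside fixed windows, reducing the quadratic cost of full attention.
import torch
import torch.nn.functional as F

def windowed_attention(x, window=16):
    # x: (batch, seq_len, dim); seq_len must be divisible by window here
    b, t, d = x.shape
    xw = x.reshape(b, t // window, window, d)          # split into windows
    scores = xw @ xw.transpose(-2, -1) / d ** 0.5      # attention within each window only
    attn = F.softmax(scores, dim=-1)
    return (attn @ xw).reshape(b, t, d)                # merge windows back

glucose_feats = torch.randn(8, 96, 64)                 # e.g. 96 time steps of features
out = windowed_attention(glucose_feats, window=16)     # (8, 96, 64)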


Subject(s)
Blood Glucose, Humans, Blood Glucose/analysis, Algorithms, Diabetes Mellitus/blood
3.
Accid Anal Prev ; 207: 107748, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39159592

ABSTRACT

Driving risk prediction is a pivotal technology in the driving safety domain, enabling targeted driving intervention strategies that enhance safety. Driving safety evolves continuously in response to the complexities of the traffic environment, forming a dynamic, ongoing sequence, and the evolutionary trend of this sequence offers valuable information for driving safety research. However, existing research on driving risk prediction has primarily concentrated on forecasting a single index, such as the driving safety level or the extreme value within a specified future timeframe, often neglecting the intrinsic properties that characterize the temporal evolution of driving safety. Leveraging the highD naturalistic driving dataset, this study employs multi-step time series forecasting to predict the risk evolution sequence throughout the car-following process, elucidates the benefits of the multi-step forecasting approach, and contrasts the predictive efficacy on driving safety levels across various temporal windows. The empirical findings demonstrate that the time series prediction model proficiently captures essential dynamics such as risk evolution trends, amplitudes, and turning points, and consequently provides predictions that are significantly more robust and comprehensive than those obtained from a single risk index. The proposed TsLeNet integrates a 2D convolutional network architecture with a dual attention mechanism, adeptly capturing and synthesizing multiple features across time steps, which significantly enhances prediction precision at each temporal interval. Comparative analyses with other mainstream models show that TsLeNet achieves the best prediction accuracy and efficiency. This research also analyzes the temporal distribution of errors, the impact pattern of features on the risk sequence, and the applicability of interaction features among surrounding vehicles. The multi-step time series forecasting approach not only offers a novel perspective for analyzing driving safety, but also supports the design and development of targeted driving intervention systems.


Subject(s)
Traffic Accidents, Automobile Driving, Forecasting, Humans, Automobile Driving/statistics & numerical data, Forecasting/methods, Traffic Accidents/prevention & control, Traffic Accidents/statistics & numerical data, Risk Assessment/methods, Time Factors, Automobiles
4.
J Imaging Inform Med ; 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39105850

ABSTRACT

Deep learning is currently developing rapidly in the field of image segmentation, and medical image segmentation is one of its key applications. Conventional CNNs have achieved great success in general medical image segmentation tasks, but they lose features during feature extraction and lack the ability to explicitly model long-range dependencies, which makes them difficult to adapt to human organ segmentation. Although methods containing attention mechanisms have made good progress in semantic segmentation, most current attention mechanisms are limited to a single sample, whereas the number of human organ image samples is large, and ignoring the correlation between samples is not conducive to image segmentation. To solve these problems, an internal and external dual-attention segmentation network (IEA-Net) is proposed in this paper, built around two designed components: the ICSwR (interleaved convolutional system with residual) module and the IEAM module. The ICSwR module contains interleaved convolutions and hopping connections, which are used for the initial extraction of features in the encoder. The IEAM (internal and external dual-attention) module consists of the LGGW-SA (local-global Gaussian-weighted self-attention) module and the EA module in a tandem structure. The LGGW-SA module focuses on learning local-global feature correlations within individual samples for efficient feature extraction, while the EA module captures inter-sample connections, addressing multi-sample complexities. Additionally, skip connections are incorporated into each IEAM module in both the encoder and decoder to reduce feature loss. We tested our method on the Synapse multi-organ segmentation dataset and the ACDC cardiac segmentation dataset, and the experimental results show that the proposed method achieves better performance than other state-of-the-art methods.

5.
Skin Res Technol ; 30(8): e13783, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39113617

ABSTRACT

BACKGROUND: In recent years, the increasing prevalence of skin cancers, particularly malignant melanoma, has become a major public health concern. The development of accurate automated segmentation techniques for skin lesions holds immense potential for alleviating the burden on medical professionals and is of substantial clinical importance for the early identification and treatment of skin cancer. Nevertheless, the irregular shape, uneven color, and noise interference of skin lesions present significant challenges to precise segmentation. It is therefore crucial to develop a high-precision, intelligent skin lesion segmentation framework for clinical treatment. METHODS: A precision-driven segmentation model for skin cancer images is proposed based on the Transformer U-Net, called BiADATU-Net, which integrates a deformable attention Transformer and bidirectional attention blocks into the U-Net. The encoder utilizes a deformable attention Transformer with a dual attention block, allowing adaptive learning of global and local features. The decoder incorporates specifically tailored scSE attention modules within the skip-connection layers to capture image-specific context information for strong feature fusion. Additionally, deformable convolution is aggregated into two different attention blocks to learn irregular lesion features for high-precision prediction. RESULTS: A series of experiments were conducted on four skin cancer image datasets (ISIC2016, ISIC2017, ISIC2018, and PH2). The findings show that our model exhibits satisfactory segmentation performance, achieving an accuracy rate of over 96% on all datasets. CONCLUSION: Our experimental results validate that the proposed BiADATU-Net achieves competitive, and in some cases superior, performance compared with state-of-the-art methods. It shows potential and value in the field of skin lesion segmentation.
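
For reference, the scSE attention mentioned in this abstract is a known concurrent spatial and channel squeeze-and-excitation module. The PyTorch sketch below follows the standard formulation; the reduction ratio and the additive fusion of the two branches are assumptions, not necessarily the paper's configuration.

# Concurrent spatial and channel squeeze-and-excitation (scSE) block, sketched
# under assumed hyperparameters.
import torch
import torch.nn as nn

class SCSEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.cse = nn.Sequential(                      # channel squeeze & excitation
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.sse = nn.Sequential(                      # spatial squeeze & excitation
            nn.Conv2d(channels, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):                              # x: (batch, C, H, W)
        return x * self.cse(x) + x * self.sse(x)

feat = torch.randn(2, 64, 128, 128)
out = SCSEBlock(64)(feat)                              # same shape, recalibrated features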


Asunto(s)
Melanoma , Neoplasias Cutáneas , Humanos , Neoplasias Cutáneas/diagnóstico por imagen , Neoplasias Cutáneas/patología , Melanoma/diagnóstico por imagen , Melanoma/patología , Algoritmos , Redes Neurales de la Computación , Procesamiento de Imagen Asistido por Computador/métodos , Interpretación de Imagen Asistida por Computador/métodos , Dermoscopía/métodos , Aprendizaje Profundo
6.
Bioengineering (Basel) ; 11(7)2024 Jul 20.
Article in English | MEDLINE | ID: mdl-39061819

ABSTRACT

The liver is a vital organ in the human body, and CT images can intuitively display its morphology. Physicians rely on liver CT images to observe its anatomical structure and areas of pathology, providing evidence for clinical diagnosis and treatment planning. To assist physicians in making accurate judgments, artificial intelligence techniques are adopted. Addressing the limitations of existing methods in liver CT image segmentation, such as weak contextual analysis and semantic information loss, we propose a novel Dual Attention-Based 3D U-Net liver segmentation algorithm on CT images. The innovations of our approach are summarized as follows: (1) We improve the 3D U-Net network by introducing residual connections to better capture multi-scale information and alleviate semantic information loss. (2) We propose the DA-Block encoder structure to enhance feature extraction capability. (3) We introduce the CBAM module into skip connections to optimize feature transmission in the encoder, reducing semantic gaps and achieving accurate liver segmentation. To validate the effectiveness of the algorithm, experiments were conducted on the LiTS dataset. The results showed that the Dice coefficient and HD95 index for liver images were 92.56% and 28.09 mm, respectively, representing an improvement of 0.84% and a reduction of 2.45 mm compared to 3D Res-UNet.
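
The CBAM module cited in this abstract is a well-known channel-then-spatial attention block. The sketch below re-implements the standard 2D form in PyTorch for illustration only; the paper works with a 3D U-Net, and the reduction ratio and 7x7 spatial kernel are taken from the original CBAM formulation, not from this paper.

# Illustrative CBAM-style module: channel attention followed by spatial attention.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(                      # shared MLP for channel attention
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                              # x: (batch, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)                # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))      # spatial attention

out = CBAM(64)(torch.randn(1, 64, 32, 32))             # same shape as the input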

7.
Entropy (Basel) ; 26(6)2024 May 31.
Article in English | MEDLINE | ID: mdl-38920489

ABSTRACT

In most silent speech research, continuously observing tongue movements is crucial, which requires the use of ultrasound to extract tongue contours. Extracting ultrasonic tongue contours precisely and in real time presents a major challenge. To tackle this challenge, the novel end-to-end lightweight network DAFT-Net is introduced for ultrasonic tongue contour extraction. Integrating the Convolutional Block Attention Module (CBAM) and the Attention Gate (AG) module with entropy-based optimization strategies, DAFT-Net establishes a comprehensive attention mechanism with dual functionality. This approach enhances feature representation by replacing the traditional skip-connection architecture, leveraging entropy and information-theoretic measures to ensure efficient and precise feature selection. Additionally, the U-Net encoder and decoder layers have been streamlined to reduce computational demands, with the reduction guided by information theory so that the network's ability to capture and utilize critical information is not compromised. Ablation studies confirm the efficacy of the integrated attention module and its components. Comparative analysis on the NS, TGU, and TIMIT datasets shows that DAFT-Net efficiently extracts relevant features and significantly reduces extraction time. These findings demonstrate the practical advantages of applying entropy and information theory principles; the approach improves the performance of medical image segmentation networks and paves the way for real-world applications.
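
The Attention Gate (AG) named in this abstract commonly follows the additive formulation from Attention U-Net. The PyTorch sketch below shows that general form, assuming the gating signal and skip feature have already been brought to the same spatial size; the channel counts are illustrative.

# Additive attention gate: suppresses irrelevant regions of a skip-path feature.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, in_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_x = nn.Conv2d(in_ch, inter_ch, 1)
        self.w_g = nn.Conv2d(gate_ch, inter_ch, 1)
        self.psi = nn.Conv2d(inter_ch, 1, 1)

    def forward(self, x, g):                 # x: skip feature, g: gating signal
        a = torch.relu(self.w_x(x) + self.w_g(g))
        alpha = torch.sigmoid(self.psi(a))   # per-pixel attention coefficients in [0, 1]
        return x * alpha

skip = torch.randn(1, 64, 64, 64)
gate = torch.randn(1, 128, 64, 64)
out = AttentionGate(64, 128, 32)(skip, gate)   # (1, 64, 64, 64)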

8.
Front Bioeng Biotechnol ; 12: 1398237, 2024.
Article in English | MEDLINE | ID: mdl-38827037

ABSTRACT

Accurate medical image segmentation is critical for disease quantification and treatment evaluation. While traditional U-Net architectures and their transformer-integrated variants excel in automated segmentation tasks, they struggle with parameter efficiency and computational complexity, often due to the extensive use of Transformers, and they lack the ability to harness the image's intrinsic position and channel features. Research employing dual position and channel attention mechanisms has also not been specifically optimized for the high-detail demands of medical images. To address these issues, this study proposes a novel deep medical image segmentation framework, called DA-TransUNet, which integrates the Transformer and a dual attention block (DA-Block) into the traditional U-shaped architecture. Tailored to the high-detail requirements of medical images, DA-TransUNet optimizes the intermittent channels of Dual Attention (DA) and employs DA in each skip connection to effectively filter out irrelevant information. This integration significantly enhances the model's capability to extract features, thereby improving the performance of medical image segmentation. DA-TransUNet is validated on medical image segmentation tasks, consistently outperforming state-of-the-art techniques across 5 datasets. In summary, DA-TransUNet makes significant strides in medical image segmentation, offering new insights into existing techniques and strengthening model performance from the perspective of image features, thereby advancing the development of high-precision automated medical image diagnosis. The code and parameters of our model will be publicly available at https://github.com/SUN-1024/DA-TransUnet.
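
As a rough illustration of a position + channel dual attention block of the general kind this abstract describes, the PyTorch sketch below follows the widely used DANet-style formulation; whether it matches the paper's DA-Block exactly, and the chosen projection sizes and residual scales, are assumptions.

# Position attention (pixel-to-pixel) plus channel attention (channel-to-channel),
# combined with learnable residual scales; an illustrative sketch only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttentionBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 8, 1)
        self.k = nn.Conv2d(channels, channels // 8, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.gamma_p = nn.Parameter(torch.zeros(1))   # position branch residual scale
        self.gamma_c = nn.Parameter(torch.zeros(1))   # channel branch residual scale

    def forward(self, x):                             # x: (B, C, H, W)
        b, c, h, w = x.shape
        # position attention: every pixel attends to every other pixel
        q = self.q(x).flatten(2).transpose(1, 2)      # (B, HW, C/8)
        k = self.k(x).flatten(2)                      # (B, C/8, HW)
        v = self.v(x).flatten(2)                      # (B, C, HW)
        pos = (v @ F.softmax(q @ k, dim=-1).transpose(1, 2)).view(b, c, h, w)
        # channel attention: every channel attends to every other channel
        xf = x.flatten(2)                             # (B, C, HW)
        ch = (F.softmax(xf @ xf.transpose(1, 2), dim=-1) @ xf).view(b, c, h, w)
        return x + self.gamma_p * pos + self.gamma_c * ch

out = DualAttentionBlock(64)(torch.randn(1, 64, 32, 32))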

9.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38851298

ABSTRACT

Deletion is a crucial type of genomic structural variation and is associated with numerous genetic diseases. The advent of third-generation sequencing technology has facilitated the analysis of complex genomic structures and the elucidation of the mechanisms underlying phenotypic changes and disease onset due to genomic variants. Importantly, it has introduced innovative perspectives for deletion variants calling. Here we propose a method named Dual Attention Structural Variation (DASV) to analyze deletion structural variations in sequencing data. DASV converts gene alignment information into images and integrates them with genomic sequencing data through a dual attention mechanism. Subsequently, it employs a multi-scale network to precisely identify deletion regions. Compared with four widely used genome structural variation calling tools: cuteSV, SVIM, Sniffles and PBSV, the results demonstrate that DASV consistently achieves a balance between precision and recall, enhancing the F1 score across various datasets. The source code is available at https://github.com/deconvolution-w/DASV.


Subject(s)
High-Throughput Nucleotide Sequencing, Software, Humans, High-Throughput Nucleotide Sequencing/methods, Sequence Deletion, DNA Sequence Analysis/methods, Algorithms, Genomics/methods, Computational Biology/methods
10.
Acad Radiol ; 2024 May 27.
Article in English | MEDLINE | ID: mdl-38806374

ABSTRACT

RATIONALE AND OBJECTIVES: We examined the effectiveness of computed tomography (CT)-based deep learning (DL) models in differentiating benign and malignant solid pulmonary nodules (SPNs) ≤ 8 mm. MATERIALS AND METHODS: The study patients (n = 719) were divided into internal training, internal validation, and external validation cohorts; all had small SPNs and had undergone preoperative chest CT and surgical resection. We developed five DL models incorporating features of the nodule and five different peri-nodular regions with the Multiscale Dual Attention Network (MDANet) to differentiate benign and malignant SPNs, and selected the best-performing model, which was then compared with four conventional algorithms (VGG19, ResNet50, ResNeXt50, and DenseNet121). Furthermore, another five DL models were constructed using MDANet to distinguish benign tumors from inflammatory nodules, and the best-performing one was selected. RESULTS: Model 4, which incorporated the nodule and the 15 mm peri-nodular region, best differentiated benign and malignant SPNs, with an area under the curve (AUC), accuracy, recall, precision, and F1-score of 0.730, 0.724, 0.711, 0.705, and 0.707 in the external validation cohort; it also performed better than the four conventional algorithms. Model 8, which incorporated the nodule and the 10 mm peri-nodular region, was the best model for distinguishing benign tumors from inflammatory nodules, with an AUC, accuracy, recall, precision, and F1-score of 0.871, 0.938, 0.863, 0.904, and 0.882 in the external validation cohort. CONCLUSION: CT-based DL models built with MDANet can accurately discriminate between small benign and malignant SPNs, and between benign tumors and inflammatory nodules.

11.
Sensors (Basel) ; 24(7)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38610367

ABSTRACT

With the rapid development of smart manufacturing, data-driven deep learning (DL) methods are widely used for bearing fault diagnosis. To address the problems that model training collapses when data are imbalanced and that traditional signal analysis methods struggle to extract fault features effectively, this paper proposes an intelligent fault diagnosis method for rolling bearings based on the Gramian Angular Difference Field (GADF) and an Improved Dual Attention Residual Network (IDARN). The original vibration signals are encoded as 2D GADF feature images for network input; the residual structures incorporate a dual attention mechanism to enhance feature integration, while the group normalization (GN) method is introduced to overcome bias caused by data discrepancies; the model is then trained to classify the faults. To verify the superiority of the proposed method, data from the Case Western Reserve University (CWRU) bearing dataset and from bearing fault experimental equipment were used to compare it with other popular DL methods, and the proposed model performed best, achieving average identification accuracies of 99.2% and 97.9% on the two types of datasets, respectively.
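
The GADF encoding named in this abstract has a standard closed form: after rescaling the signal to [-1, 1] and taking the angular representation phi = arccos(x), the image is GADF[i, j] = sin(phi_i - phi_j). The NumPy sketch below follows that standard definition; the example signal and image size are arbitrary.

# Encode a 1-D vibration signal as a Gramian Angular Difference Field image.
import numpy as np

def gadf(signal):
    x = np.asarray(signal, dtype=float)
    # min-max rescale to [-1, 1] so arccos is well defined
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    s = np.sqrt(np.clip(1 - x ** 2, 0, 1))
    # GADF[i, j] = sin(phi_i - phi_j) = s_i * x_j - x_i * s_j
    return np.outer(s, x) - np.outer(x, s)

sig = np.sin(np.linspace(0, 8 * np.pi, 128)) + 0.1 * np.random.randn(128)
img = gadf(sig)          # (128, 128) image, ready to feed a 2-D CNN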

12.
Biomed Phys Eng Express ; 10(3)2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38588648

ABSTRACT

Objective. Ultrasound-assisted orthopaedic navigation holds promise due to its non-ionizing nature, portability, low cost, and real-time performance. To facilitate such applications, accurate and real-time bone surface segmentation is critical; however, imaging artifacts and low signal-to-noise ratios in tomographical B-mode ultrasound (B-US) images create substantial challenges for bone surface detection. Approach. We present an end-to-end lightweight ultrasound bone segmentation network (UBS-Net) for bone surface detection, using the U-Net structure as the base framework and a level-set loss function for improved sensitivity to bone surface detectability. A dual attention (DA) mechanism is introduced at the end of the encoder, which considers both position and channel information to obtain the correlation between the position and channel dimensions of the feature map; axial attention (AA) replaces the traditional self-attention (SA) mechanism in the position attention module for better computational efficiency. The position attention and channel attention (CA) are combined with a two-class fusion module to form the DA map, and the decoding module completes the bone surface detection. Main Results. The method achieved a detection frame rate of 21 frames per second (fps) and outperformed the state-of-the-art method with higher segmentation accuracy (Dice similarity coefficient: 88.76% versus 87.22%) when applied to retrospective ultrasound (US) data from 11 volunteers. Significance. The proposed UBS-Net achieves outstanding accuracy and real-time performance for bone surface detection in ultrasound, outperforming state-of-the-art methods, and has potential in US-guided orthopaedic surgery applications.
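
Axial attention, the substitution for full self-attention mentioned above, applies 1-D attention along the height axis and then the width axis, lowering the cost from O((HW)^2) to roughly O(HW(H+W)). The PyTorch sketch below shows that generic pattern; the head count and dimensions are assumptions, not the paper's settings.

# Axial attention: row-wise then column-wise self-attention over a feature map.
import torch
import torch.nn as nn

class AxialAttention2d(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                               # x: (B, C, H, W)
        b, c, h, w = x.shape
        # attend along the width axis (within each row)
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c)
        # attend along the height axis (within each column)
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)  # back to (B, C, H, W)

out = AxialAttention2d(64)(torch.randn(1, 64, 32, 32))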


Asunto(s)
Procesamiento de Imagen Asistido por Computador , Relación Señal-Ruido , Ultrasonografía , Humanos , Ultrasonografía/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Algoritmos , Huesos/diagnóstico por imagen , Redes Neurales de la Computación
13.
Comput Biol Med ; 174: 108146, 2024 May.
Article in English | MEDLINE | ID: mdl-38608320

ABSTRACT

Leukocytes, also called white blood cells (WBCs), play a pivotal role in human health and are vital indicators of diseases such as malaria, leukemia, AIDS, and other viral infections. WBC detection and classification in blood smears offers insights to pathologists, aiding diagnosis across medical conditions. Traditional techniques, including manual counting, detection, classification, and visual inspection of microscopic images by medical professionals, are labor-intensive, time-consuming, and sometimes susceptible to errors. Here, we propose a high-performance convolutional neural network (CNN) coupled with a dual-attention network that efficiently detects and classifies WBCs in microscopic thick smear images. The main aim of this study was to enhance clinical hematology systems and expedite medical diagnostic processes. In the proposed technique, we utilized a deep convolutional generative adversarial network (DCGAN) to overcome the limitations imposed by limited training data and employed a dual attention mechanism to improve accuracy, efficiency, and generalization. The proposed technique achieved overall accuracy rates of 99.83%, 99.35%, and 99.60% on the peripheral blood cell (PBC), Leukocyte Images for Segmentation and Classification (LISC), and Raabin-WBC benchmark datasets, respectively. Our approach outperforms state-of-the-art methods in terms of accuracy, highlighting the effectiveness of the strategies employed and their potential to enhance diagnostic capabilities and advance real-world healthcare practices and diagnostic systems.


Asunto(s)
Leucocitos , Redes Neurales de la Computación , Humanos , Leucocitos/citología , Leucocitos/clasificación , Microscopía/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Aprendizaje Profundo
14.
Heliyon ; 10(5): e26775, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38439873

ABSTRACT

Existing approaches to 3D medical image segmentation can be generally categorized into convolution-based or transformer-based methods. While convolutional neural networks (CNNs) demonstrate proficiency in extracting local features, they encounter challenges in capturing global representations. In contrast, the consecutive self-attention modules present in vision transformers excel at capturing long-range dependencies and achieving an expanded receptive field. In this paper, we propose a novel approach, termed SCANeXt, for 3D medical image segmentation. Our method combines the strengths of dual attention (Spatial and Channel Attention) and ConvNeXt to enhance representation learning for 3D medical images. In particular, we propose a novel self-attention mechanism crafted to encompass spatial and channel relationships throughout the entire feature dimension. To further extract multiscale features, we introduce a depth-wise convolution block inspired by ConvNeXt after the dual attention block. Extensive evaluations on three benchmark datasets, namely Synapse, BraTS, and ACDC, demonstrate the effectiveness of our proposed method in terms of accuracy. Our SCANeXt model achieves a state-of-the-art result with a Dice Similarity Score of 95.18% on the ACDC dataset, significantly outperforming current methods.
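
The depth-wise convolution block "inspired by ConvNeXt" mentioned in this abstract typically pairs a large-kernel depth-wise convolution with a pointwise inverted bottleneck. The PyTorch sketch below shows the standard 2-D form for brevity (the paper works on 3-D volumes); the kernel size and expansion ratio follow the common ConvNeXt recipe and are assumptions here.

# ConvNeXt-style block: 7x7 depth-wise conv, LayerNorm, pointwise expand/contract,
# residual connection.
import torch
import torch.nn as nn

class ConvNeXtBlock(nn.Module):
    def __init__(self, dim, expansion=4):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)                 # applied channels-last
        self.pwconv1 = nn.Linear(dim, expansion * dim)
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(expansion * dim, dim)

    def forward(self, x):                             # x: (B, C, H, W)
        shortcut = x
        x = self.dwconv(x).permute(0, 2, 3, 1)        # -> (B, H, W, C) for LayerNorm/Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        return shortcut + x.permute(0, 3, 1, 2)       # residual connection

out = ConvNeXtBlock(96)(torch.randn(1, 96, 56, 56))   # same shape as the input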

15.
Artif Intell Med ; 149: 102782, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38462283

ABSTRACT

Diabetic retinopathy (DR) is the most prevalent cause of visual impairment in adults worldwide. Typically, patients with DR do not show symptoms until later stages, by which time it may be too late to receive effective treatment. DR grading is challenging because of the small size and variation of lesion patterns. The key to fine-grained DR grading is to discover more discriminating elements such as cotton wool spots, hard exudates, hemorrhages, and microaneurysms. Although deep learning models like convolutional neural networks (CNNs) seem ideal for the automated detection of abnormalities in advanced clinical imaging, small lesions are very hard to distinguish with traditional networks. This work proposes a bi-directional spatial and channel-wise parallel attention based network to learn discriminative features for diabetic retinopathy grading. The proposed attention block, plugged into a backbone network, helps extract features specific to fine-grained DR grading. This scheme boosts classification performance along with the detection of small lesion parts. Extensive experiments are performed on four widely used benchmark datasets for DR grading, and performance is evaluated with different quality metrics. For model interpretability, activation maps are generated using the LIME method to visualize the predicted lesion parts. In comparison with state-of-the-art methods, the proposed IDANet exhibits better performance for DR grading and lesion detection.


Asunto(s)
Diabetes Mellitus , Retinopatía Diabética , Adulto , Humanos , Retinopatía Diabética/diagnóstico por imagen , Retinopatía Diabética/patología , Redes Neurales de la Computación , Interpretación de Imagen Asistida por Computador/métodos
16.
Phys Med Biol ; 69(7)2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38412532

ABSTRACT

Objective. Laparoscopic renal unit-preserving resection is a routine and effective means of treating renal tumors, and image segmentation is an essential step before tumor resection. Current segmentation mainly relies on doctors' manual delineation, which is time-consuming, labor-intensive, and influenced by their personal experience and ability; the resulting segmentations are of low quality, with problems such as blurred edges and unclear size and shape that hinder clinical diagnosis. Approach. To address these problems, we propose an automated segmentation method, the UNet++ algorithm fusing multiscale residuals and dual attention (MRDA_UNet++). It replaces the two consecutive 3 × 3 convolutions in UNet++ with a 'MultiRes block' module, which incorporates coordinate attention to fuse features from different scales and suppress the impact of background noise. Furthermore, an attention gate is added at the short connections to enhance the network's ability to extract features from the target area. Main results. The experimental results show that MRDA_UNet++ achieves 93.18%, 92.87%, 93.66%, and 92.09% on the real-world dataset for MIoU, Dice, Precision, and Recall, respectively. Compared to the baseline UNet++ on three public datasets, the MIoU, Dice, and Recall metrics improved by 6.00%, 7.90%, and 18.09% for BUSI, by 0.39%, 0.27%, and 1.03% for Dataset C, and by 1.37%, 1.75%, and 1.30% for DDTI. Significance. The proposed MRDA_UNet++ exhibits clear advantages in feature extraction; it can significantly reduce the workload of doctors, further decrease the risk of misdiagnosis, and is of great value for assisting clinical diagnosis.


Asunto(s)
Neoplasias Renales , Humanos , Neoplasias Renales/diagnóstico por imagen , Riñón , Ultrasonografía , Algoritmos , Benchmarking , Procesamiento de Imagen Asistido por Computador
17.
Comput Biol Med ; 168: 107742, 2024 01.
Article in English | MEDLINE | ID: mdl-38000248

ABSTRACT

Chest radiographs are the most commonly performed radiological examinations for lesion detection. Recent advances in deep learning have led to encouraging results in various thoracic disease detection tasks. In particular, architectures with a feature pyramid network can recognise targets of different sizes. However, such networks struggle to focus on lesion regions in chest X-rays because those regions closely resemble their surroundings. In this paper, we propose a dual attention supervised module for multi-label lesion detection in chest radiographs, named DualAttNet. It efficiently fuses global and local lesion classification information based on an image-level attention block and a fine-grained disease attention algorithm. A binary cross entropy loss function is used to calculate the difference between the attention map and the ground truth at the image level, and the resulting gradient flow is leveraged to refine pyramid representations and highlight lesion-related features. We evaluate the proposed model on the VinDr-CXR, ChestX-ray8, and COVID-19 datasets. The experimental results show that DualAttNet surpasses baselines by 0.6% to 2.7% mAP and 1.4% to 4.7% AP50 with different detection architectures. The code for our work and more technical details can be found at https://github.com/xq141839/DualAttNet.
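
To illustrate the general mechanism of supervising an attention map with binary cross entropy against lesion ground truth, the PyTorch sketch below attaches a 1x1-conv attention head to pyramid features and compares its output with a downsampled lesion mask. The head, feature sizes, and mask handling are illustrative assumptions, not the DualAttNet implementation.

# BCE-supervised image-level attention map; the gradient flows back into the
# feature extractor and highlights lesion-related features.
import torch
import torch.nn as nn
import torch.nn.functional as F

attn_head = nn.Conv2d(256, 1, kernel_size=1)           # predicts one attention map

def attention_bce_loss(features, lesion_mask):
    # features: (B, 256, h, w) pyramid features; lesion_mask: (B, 1, H, W), binary
    attn_logits = attn_head(features)                   # (B, 1, h, w)
    target = F.interpolate(lesion_mask, size=attn_logits.shape[-2:], mode="nearest")
    return F.binary_cross_entropy_with_logits(attn_logits, target)

feats = torch.randn(2, 256, 32, 32)
mask = (torch.rand(2, 1, 512, 512) > 0.95).float()      # toy ground-truth lesion mask
loss = attention_bce_loss(feats, mask)
loss.backward()                                         # refines the attention head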


Asunto(s)
Algoritmos , COVID-19 , Humanos , Rayos X , COVID-19/diagnóstico por imagen , Entropía , Radiografía
18.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(5): 920-927, 2023 Oct 25.
Article in Chinese | MEDLINE | ID: mdl-37879921

ABSTRACT

Glaucoma is one of the leading causes of blindness, and the cup-to-disc ratio is the main basis for glaucoma screening, so precise segmentation of the optic cup and disc is of great significance. In this article, an optic cup and disc segmentation model based on linear attention and dual attention is proposed. Firstly, the region of interest is located and cropped according to the characteristics of the optic disc. Secondly, a linear attention residual network-34 (ResNet-34) is introduced as the feature extraction network. Finally, channel and spatial dual attention weights are generated from the linear attention output features and used to calibrate the feature maps in the decoder, yielding the optic cup and disc segmentation. Experimental results show that the intersection over union for the optic disc and cup on the Retinal Image Dataset for Optic Nerve Head Segmentation (DRISHTI-GS) is 0.9623 and 0.8564, respectively, and on the Retinal Image Database for Optic Nerve Evaluation (RIM-ONE-V3) is 0.9563 and 0.7844, respectively. The proposed model outperforms the comparison algorithms and has clear medical value for early glaucoma screening. In addition, knowledge distillation is used to generate two smaller models, making it easier to deploy them on embedded devices.
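
Linear attention, as used in the backbone above, refers to the family of kernel feature-map attention mechanisms whose cost grows linearly with sequence length. The sketch below uses the common elu(x)+1 feature map; the exact variant used in the paper is an assumption.

# Generic linear attention: O(N) instead of O(N^2) in the number of tokens.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, seq_len, dim)
    q = F.elu(q) + 1
    k = F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", k, v)           # sum_j phi(k_j) v_j^T
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

tokens = torch.randn(1, 1024, 64)                     # e.g. a flattened fundus feature map
out = linear_attention(tokens, tokens, tokens)        # (1, 1024, 64)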


Asunto(s)
Glaucoma , Disco Óptico , Humanos , Disco Óptico/diagnóstico por imagen , Glaucoma/diagnóstico , Algoritmos , Técnicas de Diagnóstico Oftalmológico , Bases de Datos Factuales
19.
Neural Netw ; 167: 433-444, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37673029

ABSTRACT

Synthesizing realistic fine-grained images from text descriptions is a significant computer vision task. Although many GAN-based methods have been proposed for this task, generating high-quality images consistent with the text information remains difficult. Existing GAN-based methods ignore important words because they use fixed initial word features in the generator, and their discriminators neglect to learn the semantic consistency between images and texts. In this article, we propose a novel attentional generation and contrastive adversarial framework for fine-grained text-to-image synthesis, termed Word Self-Update Contrastive Adversarial Networks (WSC-GAN). Specifically, we introduce a dual attention module for modeling color details and semantic information. With a newly designed word self-update module, the generator can leverage visually important words to compute attention maps in the feature synthesis module. Furthermore, we design multi-branch contrastive discriminators to maintain better consistency between the generated image and the text description, and propose two novel contrastive losses that impose image-sentence and image-word consistency constraints. Extensive experiments on the CUB and MS-COCO datasets demonstrate that our method achieves better performance than state-of-the-art methods.
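
As a simplified illustration of an image-sentence consistency constraint like the one described above, the PyTorch sketch below computes a symmetric contrastive (InfoNCE-style) loss over a batch of paired image and sentence embeddings. The symmetric form and temperature value are assumptions; the paper's two losses are not reproduced here.

# Image-sentence contrastive loss: matching pairs in a batch are pulled together,
# mismatched pairs pushed apart.
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    # img_emb, txt_emb: (batch, dim); row i of each is a matching pair
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature                # (batch, batch) similarity matrix
    labels = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, labels) +     # image -> text direction
                  F.cross_entropy(logits.t(), labels))  # text -> image direction

loss = image_text_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))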


Asunto(s)
Aprendizaje , Semántica , Procesamiento de Imagen Asistido por Computador
20.
Sensors (Basel) ; 23(17)2023 Aug 23.
Article in English | MEDLINE | ID: mdl-37687809

ABSTRACT

Road scene understanding, as a field of research, has attracted increasing attention in recent years. The development of road scene understanding capabilities that are applicable to real-world road scenarios has faced numerous complications, largely due to the cost and complexity of achieving human-level scene understanding, at which road scene elements can be segmented with a mean intersection over union score close to 1.0. There is a need for a more unified approach to road scene segmentation for use in self-driving systems. Previous works have demonstrated how deep learning methods can be combined to improve the segmentation and perception performance of road scene understanding systems. This paper proposes a novel segmentation system that uses fully connected networks, attention mechanisms, and multiple-input data stream fusion to improve segmentation performance. Results show performance comparable to previous works, with a mean intersection over union of 87.4% on the Cityscapes dataset.
