Results 1 - 20 of 90
1.
Sci Rep ; 14(1): 20615, 2024 09 04.
Article in English | MEDLINE | ID: mdl-39232028

ABSTRACT

To meet the need for automated analysis of brain tumor magnetic resonance imaging (MRI), this study introduces an enhanced instance segmentation method built upon the mask region-based convolutional neural network (Mask R-CNN). By incorporating squeeze-and-excitation networks (a channel attention mechanism) and a concatenated attention neural network (a spatial attention mechanism), the model can focus more adeptly on the critical regions and finer details of brain tumors. A residual network-50 combined with the attention modules and a feature pyramid network serves as the backbone, effectively capturing the multi-scale characteristics of brain tumors. At the same time, a region proposal network and region-of-interest alignment are used to ensure that the segmented area matches the actual tumor morphology. The originality of the research lies in replacing the original Mask R-CNN backbone with a deep residual network that combines attention mechanisms with a feature pyramid network, improving the efficiency of brain tumor feature extraction. In experiments, the precision of the model is 90.72%, 0.76% higher than that of the original model; recall is 91.68%, an increase of 0.95%; and mean intersection over union is 94.56%, an increase of 1.39%. The method achieves precise segmentation of brain tumor MRI, and doctors can easily and accurately locate the tumor area through the segmentation results, quickly measuring the tumor's diameter, area, and other properties, which provides more comprehensive diagnostic information.
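The squeeze-and-excitation recalibration this paper builds on can be sketched in a few lines. The weights below are random stand-ins for the learned bottleneck parameters, and the function name is illustrative, not from the paper:

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-excitation channel attention (sketch).
    x: feature map (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    # Squeeze: global average pool per channel -> (C,)
    s = x.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gate in (0, 1)
    z = np.maximum(w1 @ s, 0.0)
    g = 1.0 / (1.0 + np.exp(-(w2 @ z)))
    # Recalibrate: scale each channel by its gate
    return x * g[:, None, None]

rng = np.random.default_rng(0)
C, r = 8, 2
x = rng.standard_normal((C, 4, 4))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y = se_block(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because the gate is a sigmoid, each channel is only ever attenuated, never amplified, which is what lets the network emphasize tumor-relevant channels relative to the rest.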


Subject(s)
Brain Neoplasms , Magnetic Resonance Imaging , Neural Networks, Computer , Humans , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Magnetic Resonance Imaging/methods , Image Processing, Computer-Assisted/methods , Algorithms , Deep Learning , Image Interpretation, Computer-Assisted/methods
2.
Sci Rep ; 14(1): 18313, 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39112496

ABSTRACT

Object detectors based on fully convolutional networks achieve excellent performance. However, existing detection algorithms still face challenges such as low detection accuracy in dense scenes and occlusion of dense targets. To address these two challenges, we propose a Global Remote Feature Modulation End-to-End (GRFME2E) detection algorithm. In the feature extraction phase, we introduce the Concentric Attention Feature Pyramid Network (CAFPN). The CAFPN captures direction-aware and position-sensitive information, as well as global remote dependencies of deep-layer features, by combining Coordinate Attention and a Multilayer Perceptron. These features modulate the front-end shallow features, enhancing inter-layer feature adjustment to obtain comprehensive and distinctive feature representations. In the detector part, we introduce the Two-Stage Detection Head (TS Head). This head employs the First-One-to-Few (F-O2F) module to detect slightly occluded or unobstructed objects. It then uses masks to suppress already-detected instances and feeds the result to the Second-One-to-Few (S-O2F) module to identify objects that are heavily occluded. The results from both detection stages are merged to produce the final output, ensuring that objects are detected whether they are slightly obscured, unobstructed, or heavily occluded. Experimental results on a pig detection dataset demonstrate that our GRFME2E achieves an accuracy of 98.4%. More extensive experiments show that on the CrowdHuman dataset, GRFME2E achieves 91.8% and outperforms other methods.
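The mask-based hand-off between the two detection stages can be illustrated with a toy sketch. The real F-O2F/S-O2F modules are learned networks; this only mimics the idea of suppressing already-detected regions so the second stage can focus on what remains:

```python
import numpy as np

def suppress_detected(feat, boxes):
    """Zero out feature-map regions covered by first-stage detections
    so a second stage can attend to heavily occluded leftovers (toy
    illustration, not the paper's learned modules).
    feat: (H, W) feature map; boxes: list of (y0, x0, y1, x1)."""
    out = feat.copy()
    for y0, x0, y1, x1 in boxes:
        out[y0:y1, x0:x1] = 0.0
    return out

feat = np.ones((6, 6))
masked = suppress_detected(feat, [(0, 0, 3, 3), (4, 4, 6, 6)])
print(masked.sum())  # 36 - 9 - 4 = 23.0
```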

3.
J Biophotonics ; : e202400197, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39092484

ABSTRACT

Photoacoustic computed tomography (PACT) has centimeter-level imaging depth and can be used to image the human body. However, strong photoacoustic signals from skin obscure deep tissue information, hindering the frontal display and analysis of photoacoustic images in deep regions of interest. We therefore propose a 2.5D deep learning model based on a feature pyramid structure and single-type skin annotation to extract the skin region, and design a mask generation algorithm to remove the skin automatically. PACT imaging experiments on human peripheral blood vessels verified the correctness of the proposed skin-removal method. Compared with previous studies, our method is highly robust to uneven illumination, irregular skin boundaries, and reconstruction artifacts; the reconstruction errors of PACT images decreased by 20-90%, with a 1.65 dB improvement in signal-to-noise ratio. This study may provide a promising route to high-definition PACT imaging of deep tissues.

4.
Sensors (Basel) ; 24(16)2024 Aug 18.
Article in English | MEDLINE | ID: mdl-39205030

ABSTRACT

Abnormal valve positions can lead to fluctuations in the process industry, potentially triggering serious accidents. For processes that frequently require operational switching, such as green chemical processes based on renewable energy or biotechnological fermentation processes, this issue becomes even more severe. Despite this risk, many plants still rely on manual inspections to check valve status. The widespread use of cameras in large plants now makes it feasible to monitor valve positions through computer vision technology. This paper proposes a novel real-time valve monitoring approach based on computer vision to detect abnormalities in valve positions. Utilizing an improved network architecture based on YOLOv8, the method performs valve detection and feature recognition. To address the challenge of small, relatively fixed-position valves in the images, a coordinate attention module is introduced, embedding position information into the feature channels and enhancing the accuracy of valve rotation feature extraction. The valve position is then calculated using a rotation algorithm with the valve's center point and bounding box coordinates, triggering an alarm for valves that exceed a pre-set threshold. The accuracy and generalization ability of the proposed approach are evaluated through experiments on three different types of valves in two industrial scenarios. The results demonstrate that the method meets the accuracy and robustness standards required for real-time valve monitoring in industrial applications.
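A rotation check of the kind described (center point plus a bounding-box keypoint, with an alarm threshold) might look like the following sketch. The geometry, the function names, and the 15-degree threshold are illustrative assumptions, not the paper's algorithm:

```python
import math

def valve_angle_deg(center, handle_point):
    """Estimate handle rotation from the valve's center point and a
    bounding-box keypoint (hypothetical geometry). Returns the angle
    in degrees, counter-clockwise from the x-axis."""
    dx = handle_point[0] - center[0]
    dy = handle_point[1] - center[1]
    return math.degrees(math.atan2(dy, dx))

def is_abnormal(angle, expected, threshold=15.0):
    """Alarm when deviation from the expected position exceeds a
    pre-set threshold, wrapping correctly across +/-180 degrees."""
    dev = abs((angle - expected + 180.0) % 360.0 - 180.0)
    return dev > threshold

a = valve_angle_deg((0, 0), (1, 1))
print(a)                      # 45.0
print(is_abnormal(a, 90.0))   # True: 45 degrees off, above threshold
```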

5.
Sensors (Basel) ; 24(16)2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39205068

ABSTRACT

Referring video object segmentation (R-VOS) is a fundamental vision-language task which aims to segment the target referred by language expression in all video frames. Existing query-based R-VOS methods have conducted in-depth exploration of the interaction and alignment between visual and linguistic features but fail to transfer the information of the two modalities to the query vector with balanced intensities. Furthermore, most of the traditional approaches suffer from severe information loss in the process of multi-scale feature fusion, resulting in inaccurate segmentation. In this paper, we propose DCT, an end-to-end decoupled cross-modal transformer for referring video object segmentation, to better utilize multi-modal and multi-scale information. Specifically, we first design a Language-Guided Visual Enhancement Module (LGVE) to transmit discriminative linguistic information to visual features of all levels, performing an initial filtering of irrelevant background regions. Then, we propose a decoupled transformer decoder, using a set of object queries to gather entity-related information from both visual and linguistic features independently, mitigating the attention bias caused by feature size differences. Finally, the Cross-layer Feature Pyramid Network (CFPN) is introduced to preserve more visual details by establishing direct cross-layer communication. Extensive experiments have been carried out on A2D-Sentences, JHMDB-Sentences and Ref-Youtube-VOS. The results show that DCT achieves competitive segmentation accuracy compared with the state-of-the-art methods.

6.
J Imaging ; 10(8)2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39194980

ABSTRACT

For patients at risk of developing either lung cancer or colorectal cancer, the identification of suspect lesions in endoscopic video is an important procedure. The physician performs an endoscopic exam by navigating an endoscope through the organ of interest, be it the lungs or intestinal tract, and performs a visual inspection of the endoscopic video stream to identify lesions. Unfortunately, this entails a tedious, error-prone search over a lengthy video sequence. We propose a deep learning architecture that enables the real-time detection and segmentation of lesion regions from endoscopic video, with our experiments focused on autofluorescence bronchoscopy (AFB) for the lungs and colonoscopy for the intestinal tract. Our architecture, dubbed ESFPNet, draws on a pretrained Mix Transformer (MiT) encoder and a decoder structure that incorporates a new Efficient Stage-Wise Feature Pyramid (ESFP) to promote accurate lesion segmentation. In comparison to existing deep learning models, the ESFPNet model gave superior lesion segmentation performance for an AFB dataset. It also produced superior segmentation results for three widely used public colonoscopy databases and nearly the best results for two other public colonoscopy databases. In addition, the lightweight ESFPNet architecture requires fewer model parameters and less computation than other competing models, enabling the real-time analysis of input video frames. Overall, these studies point to the combined superior analysis performance and architectural efficiency of the ESFPNet for endoscopic video analysis. Lastly, additional experiments with the public colonoscopy databases demonstrate the learning ability and generalizability of ESFPNet, implying that the model could be effective for region segmentation in other domains.

7.
Heliyon ; 10(12): e32931, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39021898

ABSTRACT

Recently, with the remarkable development of deep learning technology, achievements are being updated in various computer vision fields, and the object recognition field is receiving the most attention. Nevertheless, recognition performance for small objects remains challenging, and this performance is of utmost importance in realistic applications such as searching for missing persons through aerial photography. The core structure of object recognition neural networks is the feature pyramid network (FPN); You Only Look Once (YOLO) is the most widely used representative model following this structure. In this study, we propose an attention-based scale sequence network (ASSN) that improves the scale sequence feature pyramid network (ssFPN), enhancing the performance of FPN-based detectors for small objects. ASSN is a lightweight attention module optimized for FPN-based detectors and is versatile enough to be applied to any model with a corresponding structure. The proposed ASSN demonstrated improvements over the baselines (YOLOv7 and YOLOv8) of up to 0.6% in average precision (AP). The AP for small objects (AP-S) also improved by up to 1.9%. Furthermore, ASSN exhibits higher performance than ssFPN while remaining lightweight and optimized, thereby reducing computational complexity and improving processing speed. ASSN is open source, based on YOLO versions 7 and 8, and can be found in our public repository: https://github.com/smu-ivpl/ASSN.git.

8.
Sensors (Basel) ; 24(13)2024 Jun 24.
Article in English | MEDLINE | ID: mdl-39000865

ABSTRACT

In the realm of special equipment, significant advancements have been achieved in fault detection. Nonetheless, faults originating in the equipment manifest with diverse morphological characteristics and varying scales; certain faults necessitate extrapolation from global information owing to their occurrence in localized areas, while the intricacies of the inspection area's background easily interfere with intelligent detection. Hence, a refined YOLOv8 algorithm leveraging the Swin Transformer is proposed, tailored for detecting faults in special equipment. The Swin Transformer serves as the foundational network of the YOLOv8 framework, amplifying its capability to concentrate on comprehensive features during feature extraction, which is crucial for fault analysis. A multi-head self-attention mechanism regulated by a sliding window is utilized to expand the scope of the observation window. Moreover, an asymptotic feature pyramid network is introduced to augment spatial feature extraction for smaller targets. Within this architecture, adjacent low-level features are merged, while high-level features are gradually integrated into the fusion process. This prevents loss or degradation of feature information during transmission and interaction, enabling accurate localization of smaller targets. Drawing on wheel-rail faults of lifting equipment as an illustration, the proposed method is employed to diagnose an expanded fault dataset generated through transfer learning. Experimental findings substantiate that the proposed method adeptly addresses numerous challenges encountered in the intelligent fault detection of special equipment. Moreover, it outperforms mainstream target detection models while achieving real-time detection capabilities.

9.
Sensors (Basel) ; 24(13)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39001067

ABSTRACT

Surface cracks are regarded as one of the early signs of potential damage to infrastructure, and their detection is an imperative task for preserving the structural health and safety of bridges. Human visual inspection is the most prevalent means of assessing infrastructure condition; nonetheless, it is unreliable, tedious, hazardous, and labor-intensive. This state of affairs calls for the development of a novel YOLOv8-AFPN-MPD-IoU model for instance segmentation and quantification of bridge surface cracks. First, YOLOv8s-Seg is selected as the backbone network to carry out instance segmentation. Second, an asymptotic feature pyramid network (AFPN) is incorporated to improve feature fusion and overall performance. Third, the minimum point distance (MPD) is introduced into the loss function to better capture the geometric features of surface cracks. Finally, the middle-aisle transformation is combined with Euclidean distance to compute the length and width of segmented cracks. Analytical comparisons reveal that the developed deep learning network surpasses several contemporary models, including YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and Mask R-CNN. The YOLOv8s + AFPN + MPDIoU model attains a precision of 90.7%, a recall of 70.4%, an F1-score of 79.27%, an mAP50 of 75.3%, and an mAP75 of 74.80%. Compared with the alternative models, the proposed approach improves the F1-score, mAP50, and mAP75 by at least 0.46%, 1.3%, and 1.4%, respectively. The margin of error in the measurement-model calculations is maintained at or below 5%. The developed model can therefore serve as a useful tool for the accurate characterization and quantification of different types of bridge surface cracks.
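As a rough illustration of the minimum-point-distance idea, here is the published MPDIoU formulation in NumPy-free Python; the paper's exact variant may differ. It penalizes the squared distances between matching box corners, normalized by the image diagonal:

```python
def mpd_iou(pred, gt, img_w, img_h):
    """MPDIoU between two boxes (x0, y0, x1, y1), following the
    published formulation: IoU minus normalized corner distances.
    Identical boxes give 1.0; the loss is 1 - MPDIoU."""
    # Plain intersection-over-union
    ix0, iy0 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix1, iy1 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter)
    # Squared top-left and bottom-right corner distances
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    diag = img_w ** 2 + img_h ** 2
    return iou - d1 / diag - d2 / diag

print(mpd_iou((10, 10, 50, 50), (10, 10, 50, 50), 100, 100))  # 1.0
loss = 1.0 - mpd_iou((10, 10, 50, 50), (20, 20, 60, 60), 100, 100)
```

Unlike plain IoU, the corner terms keep the gradient informative even for non-overlapping boxes, which is why MPD-style losses suit thin, elongated targets such as cracks.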

10.
Foods ; 13(11)2024 May 29.
Article in English | MEDLINE | ID: mdl-38890938

ABSTRACT

The classification of Stropharia rugoso-annulata is currently reliant on manual sorting, which may be subject to bias. To improve the sorting efficiency, automated sorting equipment could be used instead. However, sorting naked mushrooms in real time remains a challenging task due to the difficulty of accurately identifying, locating and sorting large quantities of them simultaneously. Models must be deployable on resource-limited devices, making it challenging to achieve both a high accuracy and speed. This paper proposes the APHS-YOLO (YOLOv8n integrated with AKConv, CSPPC and HSFPN modules) model, which is lightweight and efficient, for identifying Stropharia rugoso-annulata of different grades and seasons. This study includes a complete dataset of runners of different grades in spring and autumn. To enhance feature extraction and maintain the recognition accuracy, the new multi-module APHS-YOLO uses HSFPNs (High-Level Screening Feature Pyramid Networks) as a thin-neck structure. It combines an improved lightweight PConv (Partial Convolution)-based convolutional module, CSPPC (Integration of Cross-Stage Partial Networks and Partial Convolution), with the Arbitrary Kernel Convolution (AKConv) module. Additionally, to compensate for the accuracy loss due to lightweighting, APHS-YOLO employs a knowledge refinement technique during training. Compared to the original model, the optimized APHS-YOLO model uses 57.8% less memory and 62.5% fewer computational resources. It has an FPS (frames per second) of over 100 and even achieves 0.1% better accuracy metrics than the original model. These research results provide a valuable reference for the development of automatic sorting equipment for forest farmers.

11.
Sensors (Basel) ; 24(12)2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38931702

ABSTRACT

To tackle the intricate challenges associated with the low detection accuracy of images taken by unmanned aerial vehicles (UAVs), arising from the diverse sizes and types of objects coupled with limited feature information, we present SRE-YOLOv8 as an advanced method. Our method enhances the YOLOv8 object detection algorithm by leveraging the Swin Transformer and a lightweight residual feature pyramid network (RE-FPN) structure. First, we introduce an optimized Swin Transformer module into the backbone network to preserve ample global contextual information during feature extraction and to extract a broader spectrum of features using self-attention mechanisms. Subsequently, we integrate a Residual Feature Augmentation (RFA) module and a lightweight attention mechanism named ECA, thereby transforming the original FPN structure into RE-FPN and intensifying the network's emphasis on critical features. Additionally, an SOD (small object detection) layer is incorporated to enhance the network's use of spatial information, thus augmenting accuracy in detecting small objects. Finally, we employ a Dynamic Head equipped with multiple attention mechanisms in the object detection head to enhance its performance in identifying low-resolution targets amidst complex backgrounds. Experimental evaluation conducted on the VisDrone2021 dataset reveals a significant advancement, showcasing an impressive 9.2% enhancement over the original YOLOv8 algorithm.
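The ECA mechanism mentioned above replaces SE's bottleneck MLP with a small 1D convolution over the pooled channel descriptor. A minimal sketch, with a fixed averaging kernel standing in for ECA's learned convolution weights:

```python
import numpy as np

def eca(x, k=3):
    """Efficient channel attention (sketch): squeeze to a per-channel
    descriptor, apply a k-sized 1D conv across channels, gate with a
    sigmoid. The averaging kernel here is a stand-in for learned weights.
    x: feature map (C, H, W)."""
    c = x.shape[0]
    s = x.mean(axis=(1, 2))                   # squeeze -> (C,)
    pad = k // 2
    sp = np.pad(s, pad, mode="edge")
    conv = np.array([sp[i:i + k].mean() for i in range(c)])  # 1D conv
    g = 1.0 / (1.0 + np.exp(-conv))           # sigmoid gate
    return x * g[:, None, None]

x = np.ones((4, 2, 2))
y = eca(x)
print(y.shape)  # (4, 2, 2)
```

The appeal of ECA over SE is cost: k weights per layer instead of two C x C/r matrices, which is why it suits the lightweight RE-FPN setting.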

12.
Comput Biol Med ; 177: 108674, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38815486

ABSTRACT

Accurate segmentation of pulmonary nodules is essential for subsequent pathological analysis and diagnosis. However, current U-Net architectures often rely on a simple skip-connection scheme, leading to the fusion of feature maps with different semantic information, which can negatively affect the segmentation model. In response to this challenge, this study introduces a novel U-shaped model specifically designed for pulmonary nodule segmentation. The proposed model incorporates a U-Net backbone, a semantic aggregation feature pyramid module, and a reverse attention module. The semantic aggregation module combines semantic information with multi-scale features, addressing the semantic gap between the encoder and decoder. The reverse attention module explores missing object parts and captures intricate details by erasing the currently predicted salient regions from side-output features. The proposed model is evaluated on the LIDC-IDRI dataset. Experimental results reveal that the proposed method achieves a dice similarity coefficient of 89.11% and a sensitivity of 90.73%, comprehensively outperforming state-of-the-art approaches.
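The two reported metrics, Dice similarity coefficient and sensitivity, are straightforward to compute on binary masks; a minimal sketch:

```python
import numpy as np

def dice_coefficient(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def sensitivity(pred, gt):
    """Recall of the ground-truth foreground: TP / (TP + FN)."""
    tp = np.logical_and(pred, gt).sum()
    return tp / gt.sum()

gt = np.zeros((8, 8), dtype=bool); gt[2:6, 2:6] = True      # 16-px nodule
pred = np.zeros((8, 8), dtype=bool); pred[3:6, 2:6] = True  # 12 px predicted
print(dice_coefficient(pred, gt))  # 2*12 / (12+16) ~ 0.857
print(sensitivity(pred, gt))       # 12/16 = 0.75
```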


Subject(s)
Semantics , Solitary Pulmonary Nodule , Humans , Solitary Pulmonary Nodule/diagnostic imaging , Lung Neoplasms/diagnostic imaging , Neural Networks, Computer , Tomography, X-Ray Computed/methods , Algorithms , Databases, Factual
13.
Heliyon ; 10(10): e30836, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38803980

ABSTRACT

Background: Dental cavities are common oral diseases that can lead to pain, discomfort, and eventually tooth loss. Early detection and treatment of cavities can prevent these negative consequences. We propose CariSeg, an intelligent system composed of four neural networks that detects cavities in dental X-rays with 99.42% accuracy. Method: The first model of CariSeg, trained using the U-Net architecture, segments the area of interest, the teeth, and crops the radiograph around it. The next component, which segments the carious lesions, is an ensemble of three architectures: U-Net, Feature Pyramid Network, and DeepLabv3. For tooth identification, two merged datasets were used: the Tufts Dental Database, consisting of 1,000 panoramic radiography images, and another dataset of 116 anonymized panoramic X-rays taken at Noor Medical Imaging Center, Qom. For carious lesion segmentation, a dataset of 150 panoramic X-ray images was acquired from the Department of Oral and Maxillofacial Surgery and Radiology, Iuliu Hatieganu University of Medicine and Pharmacy, Cluj-Napoca. Results: The experiments demonstrate that our approach achieves 99.42% accuracy and a mean Dice coefficient of 68.2%. Conclusions: AI helps detect carious lesions by analyzing dental X-rays and identifying cavities that might be missed by human observers, leading to earlier detection and treatment of cavities and better oral health outcomes.

14.
J Imaging Inform Med ; 2024 May 17.
Article in English | MEDLINE | ID: mdl-38760643

ABSTRACT

Accurately identifying and locating lesions in chest X-rays has the potential to significantly enhance diagnostic efficiency, quality, and interpretability. However, current methods primarily focus on detecting specific diseases in chest X-rays, disregarding the presence of multiple diseases in a single scan. Moreover, the diversity in lesion locations and attributes makes it complex to accurately discern the specific traits of each lesion, diminishing accuracy when detecting multiple diseases. To address these issues, we propose a novel detection framework that enhances multi-scale lesion feature extraction and fusion, improving lesion position perception and subsequently boosting chest multi-disease detection performance. Initially, we construct a multi-scale lesion feature extraction network to tackle the uniqueness of various lesion features and locations, strengthening the global semantic correlation between lesion features and their positions. Following this, we introduce an instance-aware semantic enhancement network that dynamically amalgamates instance-specific features with high-level semantic representations across various scales. This adaptive integration effectively mitigates the loss of detailed information within lesion regions. Additionally, we perform lesion-region feature mapping using candidate boxes to preserve crucial positional information, enhancing the accuracy of chest disease detection across multiple scales. Experimental results on the VinDr-CXR dataset reveal a 6% increment in mean average precision (mAP) and an 8.4% improvement in mean recall (mR) compared to state-of-the-art baselines, demonstrating the effectiveness of the model in accurately detecting multiple chest diseases by capturing specific features and location information.

15.
Animals (Basel) ; 14(7)2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38612345

ABSTRACT

The Amur tiger is an important endangered species, and its re-identification (re-ID) plays an important role in regional biodiversity assessment and wildlife resource statistics. This paper focuses on Amur tiger re-ID from visible-light images taken from screenshots of surveillance videos or camera traps, aiming to solve the problem of low accuracy caused by camera perspective, noisy backgrounds, changes in motion posture, and deformation of the tigers' body patterns during re-ID. To overcome these challenges, we propose a serial multi-scale feature fusion and enhancement re-ID network with global and local branches. Specifically, we design a global inverted-pyramid multi-scale feature fusion method in the global branch to effectively fuse multi-scale global features and preserve high-level, fine-grained, and deep semantic features. We also design a local dual-domain attention feature enhancement method in the local branch, further enhancing local feature extraction and fusion by dividing local feature blocks. We evaluated the effectiveness and feasibility of this model on the public Amur Tiger Re-identification in the Wild (ATRW) dataset and achieved good results on mAP, Rank-1, and Rank-5, demonstrating competitiveness. In addition, since the proposed model requires neither additional expensive annotation information nor other pre-training modules, it offers important advantages such as strong transferability and simple training.

16.
Sci Rep ; 14(1): 8012, 2024 04 05.
Article in English | MEDLINE | ID: mdl-38580704

ABSTRACT

Human pose estimation (HPE) based on deep learning aims to accurately estimate and predict human body posture in images or videos via deep neural networks. However, the accuracy of real-time HPE tasks still needs improvement, owing to factors such as partial occlusion of body parts and the limited receptive field of the model. To alleviate the accuracy loss caused by these issues, this paper proposes a real-time HPE model called CCAM-Person, based on the YOLOv8 framework. Specifically, we improve the backbone and neck of the YOLOv8x-pose real-time HPE model to alleviate feature loss and receptive-field constraints. Second, we introduce the context coordinate attention module (CCAM) to heighten the model's focus on salient features, reduce background noise interference, alleviate key-point regression failures caused by limb occlusion, and improve the accuracy of pose estimation. Our approach attains competitive results on multiple metrics of two open-source datasets, MS COCO 2017 and CrowdPose. Compared with the baseline YOLOv8x-pose model, CCAM-Person improves average precision by 2.8% and 3.5% on the two datasets, respectively.


Subject(s)
Benchmarking , Extremities , Humans , Neural Networks, Computer , Posture , Videotape Recording
17.
Front Plant Sci ; 15: 1382802, 2024.
Article in English | MEDLINE | ID: mdl-38654901

ABSTRACT

When detecting tomato leaf diseases in natural environments, factors such as changes in lighting, occlusion, and the small size of leaf lesions pose challenges to detection accuracy. Therefore, this study proposes a tomato leaf disease detection method based on attention mechanisms and multi-scale feature fusion. Firstly, the Convolutional Block Attention Module (CBAM) is introduced into the backbone feature extraction network to enhance the ability to extract lesion features and suppress the effects of environmental interference. Secondly, shallow feature maps are introduced into the re-parameterized generalized feature pyramid network (RepGFPN), constructing a new multi-scale re-parameterized generalized feature fusion module (BiRepGFPN) to enhance feature fusion expression and improve the localization ability for small lesion features. Finally, the BiRepGFPN replaces the Path Aggregation Feature Pyramid Network (PAFPN) in the YOLOv6 model to achieve effective fusion of deep semantic and shallow spatial information. Experimental results indicate that, when evaluated on the publicly available PlantDoc dataset, the model's mean average precision (mAP) showed improvements of 7.7%, 11.8%, 3.4%, 5.7%, 4.3%, and 2.6% compared to YOLOX, YOLOv5, YOLOv6, YOLOv6-s, YOLOv7, and YOLOv8, respectively. When evaluated on the tomato leaf disease dataset, the model demonstrated a precision of 92.9%, a recall rate of 95.2%, an F1 score of 94.0%, and a mean average precision (mAP) of 93.8%, showing improvements of 2.3%, 4.0%, 3.1%, and 2.7% respectively compared to the baseline model. These results indicate that the proposed detection method possesses significant detection performance and generalization capabilities.

18.
Front Physiol ; 15: 1304829, 2024.
Article in English | MEDLINE | ID: mdl-38455845

ABSTRACT

Introduction: Precise classification plays an important role in the treatment of pressure injury (PI), while current machine learning or deep learning based methods of PI classification remain of low accuracy. Methods: In this study, we developed a deep learning based weighted feature fusion architecture for fine-grained classification, which combines a top-down and a bottom-up pathway to fuse high-level semantic information and low-level detail representations. We validated it on our established database, which consists of 1,519 images from multi-center clinical cohorts. ResNeXt was set as the backbone network. Results: We increased the accuracy for stage 3 PI from 60.3% to 76.2% by adding a weighted feature pyramid network (wFPN). The accuracies for stage 1, 2, and 4 PI were 0.870, 0.788, and 0.845, respectively. The overall accuracy, precision, recall, and F1-score of our network were 0.815, 0.808, 0.816, and 0.811, respectively, and the area under the receiver operating characteristic curve was 0.940. Conclusions: Compared with the current reported study, our network significantly increased the overall accuracy from 75% to 81.5% and showed great performance in predicting each stage. Upon further validation, our study will pave the way to the clinical application of our network in PI management.
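The top-down half of a weighted feature fusion pathway like wFPN can be sketched as a per-level weighted sum of each level's own map and the upsampled coarser map. The weights below are illustrative stand-ins, not the trained values:

```python
import numpy as np

def weighted_topdown_fusion(features, weights):
    """Top-down weighted feature pyramid fusion (sketch).
    features: list of (C, H_i, W_i) maps, fine to coarse, each level
    half the spatial size of the previous; weights: per-level
    (w_self, w_topdown) pairs. Returns fused maps, fine to coarse."""
    fused = [features[-1]]                     # coarsest level passes through
    for feat, (w_s, w_t) in zip(features[-2::-1], weights[-2::-1]):
        # Nearest-neighbour 2x upsample of the coarser fused map
        up = fused[-1].repeat(2, axis=1).repeat(2, axis=2)
        fused.append(w_s * feat + w_t * up)
    return fused[::-1]

c3 = np.ones((4, 8, 8)); c4 = np.ones((4, 4, 4)) * 2; c5 = np.ones((4, 2, 2)) * 3
p3, p4, p5 = weighted_topdown_fusion([c3, c4, c5], [(0.5, 0.5)] * 3)
print(p5[0, 0, 0], p4[0, 0, 0], p3[0, 0, 0])  # 3.0 2.5 1.75
```

In a full wFPN-style design the weights are learned per level, and a bottom-up pass then feeds the refined fine maps back up; this sketch shows only the top-down direction.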

19.
ISA Trans ; 146: 221-235, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38326214

ABSTRACT

Effective condition monitoring can improve the reliability of a turbine and reduce its downtime. However, due to the complexity of operating conditions, monitoring data are always mixed with poor-quality data, which disrupts the long-term dependency of the data and challenges traditional condition monitoring methods. To solve this, a joint reparameterization feature pyramid network (JRFPN) is proposed. First, three different reparameterization tricks are designed to reform temporal information and exchange cross-temporal information, alleviating the damage to long-term dependency. Second, a joint condition monitoring framework is designed to suppress feature confounding between poor-quality data and faulty data. The auxiliary task is trained to extract the degradation trend, while the main task fights feature confounding and dynamically delineates the failure threshold. The degradation-trend and failure-threshold decisions are corrected against each other to make the final joint state inference. In addition, considering the varying quality of the monitoring variables, a channel weighting mechanism is designed to strengthen JRFPN. Measured data prove that JRFPN is more effective than other methods.

20.
J Imaging Inform Med ; 37(1): 280-296, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38343216

ABSTRACT

Cervical cancer is a significant health problem worldwide, and early detection and treatment are critical to improving patient outcomes. To address this challenge, a deep learning (DL)-based cervical classification system is proposed using a 3D convolutional neural network (CNN) and a Vision Transformer (ViT) module. The proposed model leverages the capability of the 3D CNN to extract spatiotemporal features from cervical images and employs the ViT to capture and learn complex feature representations. The model consists of an input layer that receives cervical images, followed by a 3D convolution block that extracts features from the images. The generated feature maps are down-sampled using a max-pooling block to eliminate redundant information and preserve important features. Four Vision Transformer models are employed, each producing an efficient set of feature maps that captures spatiotemporal information at a specific level of abstraction. These feature maps are then supplied to a 3D feature pyramid network (FPN) module for feature concatenation. A 3D squeeze-and-excitation (SE) block recalibrates the feature responses of the network based on the interdependencies between feature maps, thereby improving the discriminative power of the model. Finally, the feature maps are reduced with a 3D average pooling layer, and the output is fed into a kernel extreme learning machine (KELM) for classification into one of five classes. The KELM uses a radial basis function (RBF) kernel to map features into a high-dimensional space and classify the input samples. Simulation results demonstrate the superiority of the proposed model, which achieves an accuracy of 98.6%, indicating its potential as an effective tool for cervical cancer classification. It can also serve as a diagnostic support tool to assist medical experts in accurately identifying cervical cancer in patients.
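KELM training has a well-known closed-form solution: solve (K + I/C) beta = Y for the output weights. A toy two-class sketch with an RBF kernel, where the hyperparameters C and gamma and the data are illustrative, not the paper's:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between row-vector sets A (n, d) and B (m, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kelm_fit(X, y_onehot, C=100.0, gamma=0.5):
    """Standard KELM closed form: beta = (K + I/C)^-1 Y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, y_onehot)

def kelm_predict(X_train, beta, X_new, gamma=0.5):
    """Score new samples; argmax over columns gives the class."""
    return rbf_kernel(X_new, X_train, gamma) @ beta

# Toy two-class problem standing in for the five cervical classes
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
beta = kelm_fit(X, Y)
pred = kelm_predict(X, beta, np.array([[0.05, 0.05], [0.95, 1.0]]))
print(pred.argmax(axis=1))  # [0 1]
```

The appeal over iterative training is that fitting is a single linear solve, with C trading off fit against regularization.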
