1.
Sensors (Basel) ; 24(14)2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39065889

ABSTRACT

Remote sensing images are characterized by high complexity, significant scale variations, and abundant details, which present challenges for existing deep learning-based super-resolution reconstruction methods. These algorithms often exhibit limited convolutional receptive fields and thus struggle to establish global contextual information, which can lead to inadequate utilization of both global and local details and limited generalization capability. To address these issues, this study introduces a novel multi-branch residual hybrid attention block (MBRHAB) as part of a proposed super-resolution reconstruction model for remote sensing data, which incorporates several attention mechanisms to enhance performance. First, the model employs window-based multi-head self-attention to model long-range dependencies in images. A multi-branch convolution module (MBCM) is then constructed to enlarge the convolutional receptive field for improved representation of global information. Convolutional attention is subsequently combined across channel and spatial dimensions to strengthen associations between different features and areas containing crucial details, thereby augmenting local semantic information. Finally, the model adopts a parallel design to enhance computational efficiency. Generalization performance was assessed using a cross-dataset approach involving two training datasets (NWPU-RESISC45 and PatternNet) and a third test dataset (UCMerced-LandUse). Experimental results confirmed that the proposed method surpassed existing super-resolution algorithms, including bicubic interpolation, SRCNN, ESRGAN, Real-ESRGAN, IRN, and DSSR, in PSNR and SSIM across various magnification scales.
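The paper evaluates reconstruction quality with PSNR and SSIM. For orientation, a minimal pure-Python sketch of both metrics (the SSIM here is a simplified single-window version over flattened pixel lists, not the usual sliding Gaussian-window variant):

```python
import math

def psnr(ref, img, max_val=255.0):
    """Peak signal-to-noise ratio between two equally sized images (flat pixel lists)."""
    mse = sum((r - x) ** 2 for r, x in zip(ref, img)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

def ssim_global(ref, img, max_val=255.0):
    """Simplified single-window SSIM (no sliding window, no Gaussian weighting)."""
    n = len(ref)
    mu_x = sum(ref) / n
    mu_y = sum(img) / n
    var_x = sum((r - mu_x) ** 2 for r in ref) / n
    var_y = sum((x - mu_y) ** 2 for x in img) / n
    cov = sum((r - mu_x) * (x - mu_y) for r, x in zip(ref, img)) / n
    c1 = (0.01 * max_val) ** 2  # standard stabilizing constants
    c2 = (0.03 * max_val) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

Higher is better for both; identical images give infinite PSNR and an SSIM of 1.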

2.
J Imaging ; 10(6)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38921612

ABSTRACT

The automatic segmentation of cardiac computed tomography (CT) and magnetic resonance imaging (MRI) plays a pivotal role in the prevention and treatment of cardiovascular diseases. In this study, we propose an efficient network based on the multi-scale, multi-head self-attention (MSMHSA) mechanism. The incorporation of this mechanism enables us to achieve larger receptive fields, facilitating the accurate segmentation of whole heart structures in both CT and MRI images. Within this network, features extracted from the shallow feature extraction network undergo an MHSA mechanism that closely aligns with human vision, resulting in more comprehensive and accurate extraction of contextual semantic information. To improve the precision of cardiac substructure segmentation across varying sizes, our proposed method introduces three MHSA networks at distinct scales. This approach allows for fine-tuning the accuracy of micro-object segmentation by adapting the size of the segmented images. The efficacy of our method is rigorously validated on the Multi-Modality Whole Heart Segmentation (MM-WHS) Challenge 2017 dataset, demonstrating competitive results and the accurate segmentation of seven cardiac substructures in both cardiac CT and MRI images. Through comparative experiments with advanced transformer-based models, our study provides compelling evidence that despite the remarkable achievements of transformer-based models, the fusion of CNN models and self-attention remains a simple yet highly effective approach for dual-modality whole heart segmentation.
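Several entries in this listing build on multi-head self-attention. As a reference point, a minimal NumPy sketch of standard scaled dot-product MHSA (Vaswani et al.); this is the generic mechanism, not the window-based or multi-scale variants the papers above propose:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x: (seq, d_model); wq/wk/wv/wo: (d_model, d_model). Returns (seq, d_model)."""
    seq, d_model = x.shape
    d_head = d_model // num_heads
    # Project and split into heads: (num_heads, seq, d_head)
    q = (x @ wq).reshape(seq, num_heads, d_head).transpose(1, 0, 2)
    k = (x @ wk).reshape(seq, num_heads, d_head).transpose(1, 0, 2)
    v = (x @ wv).reshape(seq, num_heads, d_head).transpose(1, 0, 2)
    # Scaled dot-product attention per head
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head), axis=-1)
    # Merge heads back and apply the output projection
    out = (attn @ v).transpose(1, 0, 2).reshape(seq, d_model)
    return out @ wo
```

Each head attends over the full sequence, which is what gives these models the large effective receptive field the abstracts emphasize.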

3.
Front Genet ; 15: 1408688, 2024.
Article in English | MEDLINE | ID: mdl-38873109

ABSTRACT

N4-acetylcytidine (ac4C) is a chemical modification in mRNAs that alters the structure and function of mRNA by adding an acetyl group to the N4 position of cytosine. Researchers have shown that ac4C is closely associated with the occurrence and development of various cancers. Therefore, accurate prediction of ac4C modification sites on human mRNA is crucial for revealing its role in diseases and developing new diagnostic and therapeutic strategies. However, existing deep learning models still have limitations in prediction accuracy and generalization ability, which restrict their effectiveness in handling complex biological sequence data. This paper introduces a deep learning-based model, STM-ac4C, for predicting ac4C modification sites on human mRNA. The model combines the advantages of selective kernel convolution, temporal convolutional networks, and multi-head self-attention mechanisms to effectively extract and integrate multi-level features of RNA sequences, thereby achieving high-precision prediction of ac4C sites. On the independent test dataset, STM-ac4C showed improvements of 1.81%, 3.5%, and 0.37% in accuracy, Matthews correlation coefficient, and area under the curve, respectively, compared to existing state-of-the-art technologies. Moreover, its performance on additional balanced and imbalanced datasets also confirmed the model's robustness and generalization ability. Various experimental results indicate that STM-ac4C outperforms existing methods in predictive performance. In summary, STM-ac4C excels in predicting ac4C modification sites on human mRNA, providing a powerful new tool for a deeper understanding of the biological significance of mRNA modifications and cancer treatment. Additionally, the model reveals key sequence features that influence the prediction of ac4C sites through sequence region impact analysis, offering new perspectives for future research. The source code and experimental data are available at https://github.com/ymy12341/STM-ac4C.
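STM-ac4C reports accuracy, Matthews correlation coefficient (MCC), and area under the curve. A minimal sketch of binary MCC, the least familiar of the three, computed from the confusion-matrix counts:

```python
def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1). Range [-1, 1]."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    # Convention: return 0 when any marginal is empty (denominator is zero)
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike accuracy, MCC stays informative on the imbalanced datasets the abstract mentions, since it accounts for all four confusion-matrix cells.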

4.
BMC Cancer ; 24(1): 683, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38840078

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) emerge in various organisms, ranging from viruses to humans, and play crucial regulatory roles within cells, participating in a variety of biological processes. In many prediction methods for miRNA-disease associations, the problem of over-dependence on both similarity measurement data and the association matrix remains unaddressed. In this paper, a miRNA-disease association prediction model (TP-MDA) based on tree-path global feature extraction and a fully connected artificial neural network (FANN) with a multi-head self-attention mechanism is proposed. The TP-MDA model uses an association tree structure to represent the data relationships, a multi-head self-attention mechanism to extract feature vectors, and a fully connected artificial neural network trained with 5-fold cross-validation. RESULTS: The experimental results indicate that the TP-MDA model outperforms the comparative models, with an AUC of 0.9714. In case studies of miRNAs associated with colorectal cancer and lung cancer, among the top 15 miRNAs predicted by the model, 12 and 15 were validated, respectively, and the accuracy reached 0.9227. CONCLUSIONS: The model proposed in this paper can accurately predict miRNA-disease associations and can serve as a valuable reference for data mining and association prediction in life sciences, biology, and disease genetics, among other fields.
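TP-MDA is trained with 5-fold cross-validation. A minimal sketch of generating the train/validation index splits; the shuffled round-robin fold assignment here is an illustrative choice, not necessarily the paper's exact protocol:

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield k (train_idx, val_idx) pairs covering all samples exactly once as validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # deterministic shuffle
    folds = [idx[i::k] for i in range(k)]      # round-robin assignment to k folds
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val
```

Each sample appears in exactly one validation fold, so the k validation scores can be averaged into a single cross-validated estimate.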


Subject(s)
MicroRNAs; Neural Networks, Computer; Humans; MicroRNAs/genetics; Genetic Predisposition to Disease; Computational Biology/methods; Colorectal Neoplasms/genetics; Lung Neoplasms/genetics; Algorithms
5.
Front Genet ; 15: 1381997, 2024.
Article in English | MEDLINE | ID: mdl-38770418

ABSTRACT

Accurate identification of potential drug-target pairs is a crucial step in drug development and drug repositioning, characterized by the ability of a drug to bind to and modulate the activity of the target molecule, producing the desired therapeutic effect. As machine learning and deep learning technologies advance, an increasing number of models are being engaged for the prediction of drug-target interactions. However, improving prediction accuracy and efficiency remains a major challenge. In this study, we proposed a deep learning method called Multi-source Information Fusion and Attention Mechanism for Drug-Target Interaction (MIFAM-DTI) to predict drug-target interactions. Firstly, the physicochemical property feature vector and the Molecular ACCess System (MACCS) molecular fingerprint feature vector of a drug were extracted based on its SMILES sequence. The dipeptide composition feature vector and the Evolutionary Scale Modeling-1b (ESM-1b) feature vector of a target were constructed based on its amino acid sequence information. Secondly, PCA was employed to reduce the dimensionality of the four feature vectors, and adjacency matrices were constructed by calculating cosine similarity. Thirdly, the two feature vectors of each drug were concatenated, and the two adjacency matrices were combined with a logical OR operation. They were then fed into a model composed of a graph attention network and multi-head self-attention to obtain the final drug feature vectors. The final target feature vectors were obtained with the same method. Finally, these final feature vectors were concatenated and served as the input to a fully connected layer, producing the prediction output.
MIFAM-DTI not only integrated multi-source information to capture the drug and target features more comprehensively, but also utilized the graph attention network and multi-head self-attention to autonomously learn attention weights and more comprehensively capture information in sequence data. Experimental results demonstrated that MIFAM-DTI outperformed state-of-the-art methods in terms of AUC and AUPR. Case study results of coenzymes involved in cellular energy metabolism also demonstrated the effectiveness and practicality of MIFAM-DTI. The source code and experimental data for MIFAM-DTI are available at https://github.com/Search-AB/MIFAM-DTI.
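MIFAM-DTI builds adjacency matrices by thresholding cosine similarity between reduced feature vectors and fuses two matrices with a logical OR. A minimal sketch of those two steps; the threshold value is an assumption for illustration, not taken from the paper:

```python
import numpy as np

def cosine_adjacency(features, threshold=0.5):
    """Binary adjacency: edge where cosine similarity between feature rows exceeds threshold."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero rows
    sim = unit @ unit.T                            # pairwise cosine similarity
    return (sim > threshold).astype(int)

def fuse_adjacency(a1, a2):
    """Element-wise logical OR of two binary adjacency matrices."""
    return np.logical_or(a1, a2).astype(int)
```

The OR fusion keeps an edge if either feature view considers the pair similar, which densifies the graph fed to the graph attention network.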

6.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38762789

ABSTRACT

Identifying drug-target interactions (DTIs) holds significant importance in drug discovery and development, playing a crucial role in various areas such as virtual screening, drug repurposing and identification of potential drug side effects. However, existing methods commonly exploit only a single type of feature from drugs and targets, suffering from miscellaneous challenges such as high sparsity and cold-start problems. We propose a novel framework called MSI-DTI (Multi-Source Information-based Drug-Target Interaction Prediction) to enhance prediction performance, which obtains feature representations from different views by integrating biometric features and knowledge graph representations from multi-source information. Our approach involves constructing a Drug-Target Knowledge Graph (DTKG), obtaining multiple feature representations from diverse information sources for SMILES sequences and amino acid sequences, incorporating network features from DTKG and performing an effective multi-source information fusion. Subsequently, we employ a multi-head self-attention mechanism coupled with residual connections to capture higher-order interaction information between sparse features while preserving lower-order information. Experimental results on DTKG and two benchmark datasets demonstrate that our MSI-DTI outperforms several state-of-the-art DTI prediction methods, yielding more accurate and robust predictions. The source codes and datasets are publicly accessible at https://github.com/KEAML-JLU/MSI-DTI.


Subject(s)
Drug Discovery; Computational Biology/methods; Algorithms; Humans
7.
Sensors (Basel) ; 24(6)2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38544062

ABSTRACT

To improve on the real-time performance of micro-Doppler-map-based gesture recognition with mmWave radar, point cloud-based gesture recognition for mmWave radar is proposed in this paper. Two steps are carried out for mmWave radar-based gesture recognition. The first step is to estimate the point cloud of the gestures by 3D-FFT and peak grouping. The second step is to train the TRANS-CNN model by combining multi-head self-attention and a 1D convolutional network, so as to extract features in the point cloud data at a deeper level and categorize the gestures. In the experiments, the TI mmWave radar sensor IWR1642 is used as a benchmark to evaluate the feasibility of the proposed approach. The results show that the accuracy of gesture recognition reaches 98.5%. To prove the effectiveness of our approach, a simple 2Tx2Rx radar sensor was developed in our lab, with which the recognition accuracy reaches 97.1%. The results show that our proposed gesture recognition approach achieves the best real-time performance with limited training data in comparison with existing methods.
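The first step estimates a point cloud via 3D-FFT and peak grouping. A minimal 1D range-FFT peak-picking sketch illustrates the core idea; the paper's full pipeline also resolves the Doppler and angle dimensions:

```python
import numpy as np

def range_profile_peaks(adc_samples, num_top=3):
    """1D range FFT over one chirp's ADC samples; returns the strongest bin indices."""
    spectrum = np.abs(np.fft.fft(adc_samples))
    half = spectrum[: len(spectrum) // 2]        # keep positive-frequency bins only
    return list(np.argsort(half)[::-1][:num_top])  # bins sorted by descending magnitude
```

Each strong bin corresponds to a reflector at a range proportional to the beat frequency; grouping neighboring peaks across chirps and antennas yields the gesture point cloud.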

8.
BMC Genomics ; 25(1): 86, 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38254021

ABSTRACT

BACKGROUND AND OBJECTIVES: Comprehensive analysis of multi-omics data is crucial for accurately formulating effective treatment plans for complex diseases. Supervised ensemble methods have gained popularity in recent years for multi-omics data analysis. However, existing research based on supervised learning algorithms often fails to fully harness the information from unlabeled nodes and overlooks the latent features within and among different omics, as well as the various associations among features. Here, we present a novel multi-omics integrative method, MOSEGCN, based on the Transformer multi-head self-attention mechanism and Graph Convolutional Networks (GCN), with the aim of enhancing the accuracy of complex disease classification. MOSEGCN first employs the Transformer multi-head self-attention mechanism and Similarity Network Fusion (SNF) to separately learn the inherent correlations of latent features within and among different omics, constructing a comprehensive view of diseases. Subsequently, it feeds the learned crucial information into a self-ensembling Graph Convolutional Network (SEGCN) built upon semi-supervised learning methods for training and testing, facilitating a better analysis and utilization of information from multi-omics data to achieve precise classification of disease subtypes. RESULTS: The experimental results show that MOSEGCN outperforms several state-of-the-art multi-omics integrative analysis approaches on three types of omics data: mRNA expression data, microRNA expression data, and DNA methylation data, with accuracy rates of 83.0% for Alzheimer's disease and 86.7% for breast cancer subtyping. Furthermore, MOSEGCN exhibits strong generalizability on the GBM dataset, enabling the identification of important biomarkers for related diseases.
CONCLUSION: MOSEGCN explores the significant relationship information among different omics and within each omics' latent features, effectively leveraging labeled and unlabeled information to further enhance the accuracy of complex disease classification. It also provides a promising approach for identifying reliable biomarkers, paving the way for personalized medicine.


Subject(s)
Alzheimer Disease; Multiomics; Humans; DNA Methylation; Algorithms; Biomarkers
9.
Methods ; 222: 142-151, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38242383

ABSTRACT

Protein-protein interactions play an important role in various biological processes, and interactions among proteins have a wide range of applications. Therefore, the correct identification of protein-protein interaction sites is crucial. In this paper, we propose a novel predictor for protein-protein interaction sites, AGF-PPIS, which combines a multi-head self-attention mechanism (introducing a graph structure), a graph convolutional network, and a feed-forward neural network. We use the Euclidean distance between protein residues to generate the corresponding protein graph as the input of AGF-PPIS. On the independent test dataset Test_60, AGF-PPIS achieves superior performance over comparative methods in terms of seven evaluation metrics (ACC, precision, recall, F1-score, MCC, AUROC, AUPRC), which fully demonstrates the validity and superiority of the proposed AGF-PPIS model. The source code and usage instructions for AGF-PPIS are available at https://github.com/fxh1001/AGF-PPIS.
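AGF-PPIS builds a protein graph from inter-residue Euclidean distances. A minimal sketch of that construction; the 8 Å cutoff is an assumed illustrative value, not necessarily the paper's:

```python
import math

def residue_graph(coords, cutoff=8.0):
    """Adjacency list connecting residues whose 3D coordinates lie within `cutoff`.

    coords: list of (x, y, z) tuples, one per residue.
    """
    n = len(coords)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:  # Euclidean distance
                adj[i].append(j)
                adj[j].append(i)
    return adj
```

The resulting graph is what the graph convolutional and attention layers operate on, letting spatially close residues exchange features even when far apart in sequence.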


Subject(s)
Benchmarking; Proton Pump Inhibitors; Neural Networks, Computer; Software
10.
Sensors (Basel) ; 23(20)2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37896622

ABSTRACT

Sugarcane is an important raw material for sugar and chemical production. However, in recent years, various sugarcane diseases have emerged, severely impacting the national economy. To address the issue of identifying diseases in sugarcane leaf sections, this paper proposes the SE-VIT hybrid network. Unlike traditional methods that directly use models for classification, this paper compares threshold, K-means, and support vector machine (SVM) algorithms for extracting leaf lesions from images. Due to SVM's ability to accurately segment these lesions, it is ultimately selected for the task. The paper introduces the SE attention module into ResNet-18 (CNN), enhancing the learning of inter-channel weights. After the pooling layer, multi-head self-attention (MHSA) is incorporated. Finally, with the inclusion of 2D relative positional encoding, the accuracy is improved by 5.1%, precision by 3.23%, and recall by 5.17%. The SE-VIT hybrid network model achieves an accuracy of 97.26% on the PlantVillage dataset. Additionally, when compared to four existing classical neural network models, SE-VIT demonstrates significantly higher accuracy and precision, reaching 89.57% accuracy. Therefore, the method proposed in this paper can provide technical support for intelligent management of sugarcane plantations and offer insights for addressing plant diseases with limited datasets.


Subject(s)
Saccharum; Algorithms; Edible Grain; Intelligence; Plant Leaves
11.
Sensors (Basel) ; 23(19)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37836863

ABSTRACT

Stuttering, a prevalent neurodevelopmental disorder, profoundly affects fluent speech, causing involuntary interruptions and recurrent sound patterns. This study addresses the critical need for the accurate classification of stuttering types. The researchers introduce "TranStutter", a pioneering Convolution-free Transformer-based DL model, designed to excel in speech disfluency classification. Unlike conventional methods, TranStutter leverages Multi-Head Self-Attention and Positional Encoding to capture intricate temporal patterns, yielding superior accuracy. In this study, the researchers employed two benchmark datasets: the Stuttering Events in Podcasts Dataset (SEP-28k) and the FluencyBank Interview Subset. SEP-28k comprises 28,177 audio clips from podcasts, meticulously annotated into distinct dysfluent and non-dysfluent labels, including Block (BL), Prolongation (PR), Sound Repetition (SR), Word Repetition (WR), and Interjection (IJ). The FluencyBank subset encompasses 4144 audio clips from 32 People Who Stutter (PWS), providing a diverse set of speech samples. TranStutter's performance was assessed rigorously. On SEP-28k, the model achieved an impressive accuracy of 88.1%. Furthermore, on the FluencyBank dataset, TranStutter demonstrated its efficacy with an accuracy of 80.6%. These results highlight TranStutter's significant potential in revolutionizing the diagnosis and treatment of stuttering, thereby contributing to the evolving landscape of speech pathology and neurodevelopmental research. The innovative integration of Multi-Head Self-Attention and Positional Encoding distinguishes TranStutter, enabling it to discern nuanced disfluencies with unparalleled precision. This novel approach represents a substantial leap forward in the field of speech pathology, promising more accurate diagnostics and targeted interventions for individuals with stuttering disorders.
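TranStutter combines multi-head self-attention with positional encoding. A minimal sketch of the standard sinusoidal positional encoding (Vaswani et al.), which may differ from TranStutter's exact scheme:

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: seq_len vectors of dimension d_model.

    Even dimensions get sin, odd dimensions get cos, at geometrically
    spaced frequencies, so each position has a unique signature.
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe
```

Because self-attention is order-invariant, these vectors are added to the frame embeddings so the model can distinguish where in the utterance a disfluency occurs.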


Subject(s)
Deep Learning; Stuttering; Humans; Speech; Stuttering/diagnosis; Speech Disorders; Speech Production Measurement
12.
Int J Mol Sci ; 24(18)2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37762445

ABSTRACT

Accurate identification of potential drug-target interactions (DTIs) is a crucial task in drug development and repositioning. Despite the remarkable progress achieved in recent years, improving the performance of DTI prediction still presents significant challenges. In this study, we propose a novel end-to-end deep learning model called AMMVF-DTI (attention mechanism and multi-view fusion), which leverages a multi-head self-attention mechanism to explore varying degrees of interaction between drugs and target proteins. More importantly, AMMVF-DTI extracts interactive features between drugs and proteins from both node-level and graph-level embeddings, enabling a more effective modeling of DTIs. This advantage is generally lacking in existing DTI prediction models. Consequently, when compared to many of the state-of-the-art methods, AMMVF-DTI demonstrated excellent performance on the human, C. elegans, and DrugBank baseline datasets, which can be attributed to its ability to incorporate interactive information and mine features from both local and global structures. The results from additional ablation experiments also confirmed the importance of each module in our AMMVF-DTI model. Finally, a case study is presented utilizing our model for COVID-19-related DTI prediction. We believe the AMMVF-DTI model can not only achieve reasonable accuracy in DTI prediction, but also provide insights into the understanding of potential interactions between drugs and targets.


Subject(s)
COVID-19; Humans; Animals; Caenorhabditis elegans; Drug Development; Drug Interactions
13.
Sensors (Basel) ; 23(16)2023 Aug 16.
Article in English | MEDLINE | ID: mdl-37631742

ABSTRACT

Infrared and visible image fusion aims to generate a single fused image that not only contains rich texture details and salient objects, but also facilitates downstream tasks. However, existing works mainly focus on learning different modality-specific or shared features, and ignore the importance of modeling cross-modality features. To address these challenges, we propose Dual-branch Progressive learning for infrared and visible image fusion with a complementary self-Attention and Convolution (DPACFuse) network. On the one hand, we propose Cross-Modality Feature Extraction (CMEF) to enhance information interaction and the extraction of common features across modalities. In addition, we introduce a high-frequency gradient convolution operation to extract fine-grained information and suppress high-frequency information loss. On the other hand, to alleviate the CNN issues of insufficient global information extraction and computation overheads of self-attention, we introduce the ACmix, which can fully extract local and global information in the source image with a smaller computational overhead than pure convolution or pure self-attention. Extensive experiments demonstrated that the fused images generated by DPACFuse not only contain rich texture information, but can also effectively highlight salient objects. Additionally, our method achieved approximately 3% improvement over the state-of-the-art methods in MI, Qabf, SF, and AG evaluation indicators. More importantly, our fused images enhanced object detection and semantic segmentation by approximately 10%, compared to using infrared and visible images separately.

14.
BMC Bioinformatics ; 24(1): 323, 2023 Aug 26.
Article in English | MEDLINE | ID: mdl-37633938

ABSTRACT

BACKGROUND: Prediction of drug-target interaction (DTI) is an essential step for drug discovery and drug repositioning. Traditional methods are mostly time-consuming and labor-intensive; deep learning-based methods address these limitations and are increasingly applied in practice. Most current deep learning methods employ representation learning of unimodal information such as SMILES sequences, molecular graphs, or molecular images of drugs. In addition, most methods focus on feature extraction from the drug and target alone, without fusion learning from the interacting parties, which may lead to insufficient feature representation. MOTIVATION: In order to capture more comprehensive drug features, we utilize both the molecular image and the chemical features of drugs. The image of the drug mainly carries its structural information and spatial features, while the chemical information includes its functions and properties; these complement each other, making the drug representation more effective and complete. Meanwhile, to enhance interactive feature learning between drug and target, we introduce a bidirectional multi-head attention mechanism to improve the performance of DTI. RESULTS: We propose a novel deep learning model for the DTI task, called MCL-DTI, which uses multimodal drug information and learns drug-target interaction representations for prediction. To explore a more comprehensive representation of drug features, this paper first exploits two modalities of drug information, the molecular image and the chemical text, to represent the drug. We also introduce a bidirectional multi-head cross-attention (MCA) method to learn the interrelationships between drugs and targets. Thus, we build two decoders, each including a multi-head self-attention (MSA) block and an MCA block, for cross-information learning. We use a separate decoder for the drug and the target to obtain the interaction feature maps. Finally, we feed the feature maps generated by the decoders into a fusion block for feature extraction and output the prediction results. CONCLUSIONS: MCL-DTI achieves the best results on all three datasets, Human, C. elegans, and Davis, which include both balanced and unbalanced data. The results on the drug-drug interaction (DDI) task show that MCL-DTI has strong generalization capability and can easily be applied to other tasks.


Subject(s)
Caenorhabditis elegans; Simulation Training; Humans; Animals; Drug Interactions; Drug Delivery Systems; Drug Discovery
15.
Sensors (Basel) ; 23(15)2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37571539

ABSTRACT

Convolutional neural networks have achieved good results in target detection in many application scenarios, but they still face great challenges in scenarios with small target sizes and complex backgrounds. To solve the problem of low accuracy in infrared weak target detection in complex scenes, while considering the real-time requirements of the detection task, we choose the YOLOv5s target detection algorithm for improvement. We add the Bottleneck Transformer structure and CoordConv to the network to optimize the model parameters and improve the performance of the detection network. Meanwhile, a two-dimensional Gaussian distribution is used to describe the importance of pixel points in the target box, and the normalized Gaussian Wasserstein distance (NWD) is used to measure the similarity between the predicted box and the ground-truth box in the loss function for weak targets, which makes the loss less sensitive to slight positional deviations and improves detection accuracy. Finally, experimental verification shows that, compared with other mainstream detection algorithms, the improved algorithm significantly improves target detection accuracy, with the mAP reaching 96.7%, 2.2 percentage points higher than YOLOv5s.
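The NWD models each bounding box as a 2D Gaussian and maps the 2-Wasserstein distance between the two Gaussians into a similarity score. A minimal sketch following the published NWD formulation; the constant c is dataset-dependent, and the default below is an assumed illustrative value:

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between (cx, cy, w, h) boxes.

    A box is treated as a Gaussian with mean (cx, cy) and diagonal
    covariance (w/2, h/2); the closed-form 2-Wasserstein distance is
    exponentiated into a similarity in (0, 1], with 1 for identical boxes.
    """
    (cxa, cya, wa, ha), (cxb, cyb, wb, hb) = box_a, box_b
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2.0) ** 2 + ((ha - hb) / 2.0) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)
```

Unlike IoU, this similarity degrades smoothly with center offset even when tiny boxes no longer overlap, which is why it suits weak/small targets.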

16.
Sensors (Basel) ; 23(15)2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37571637

ABSTRACT

With the rapid development of fingerprint recognition systems, fingerprint liveness detection is increasingly regarded as the main countermeasure to protect fingerprint identification systems from spoofing attacks. Convolutional neural networks have shown great potential in fingerprint liveness detection. However, the generalization ability of deep network models to unknown materials, and the computational complexity of the networks, need to be further improved. A new lightweight fingerprint liveness detection network is here proposed to distinguish fake fingerprints from real ones. The method mainly comprises foreground extraction, fingerprint image blocking, style transfer based on CycleGAN, and an improved ResNet with a multi-head self-attention (MHSA) mechanism. The proposed method can effectively extract the ROI and obtain an end-to-end data structure, which increases the amount of data. For fake fingerprints generated from unknown materials, the CycleGAN network improves the model's generalization ability. The introduction of a Transformer with MHSA into the improved ResNet improves detection performance and reduces computing overhead. Experiments on the LivDet2011, LivDet2013, and LivDet2015 datasets showed that the proposed method achieves good results. For example, on the LivDet2015 dataset, our method achieved an average classification error of 1.72 across all sensors while significantly reducing network parameters, to an overall parameter count of only 0.83 M. An experiment on small-area fingerprints yielded an accuracy of 95.27%.

17.
Neural Netw ; 165: 809-829, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37418863

ABSTRACT

The past decade has witnessed significant progress in detecting objects using the rich features of deep learning models. However, most existing models are unable to detect x-small and dense objects, due to ineffective feature extraction and substantial misalignments between anchor boxes and axis-aligned convolution features, which lead to a discrepancy between the categorization score and the positioning accuracy. This paper introduces an anchor regenerative-based transformer module in a feature refinement network to solve this problem. The anchor-regenerative module generates anchor scales based on the semantic statistics of the objects present in the image, which avoids the inconsistency between anchor boxes and axis-aligned convolution features. The Multi-Head Self-Attention (MHSA)-based transformer module extracts in-depth information from the feature maps based on query, key, and value parameters. The proposed model is experimentally verified on the VisDrone, VOC, and SKU-110K datasets, generating different anchor scales for the three datasets and achieving higher mAP, precision, and recall values on all of them. These results show that the proposed model outperforms existing models in detecting x-small as well as dense objects. Finally, we evaluated performance on the three datasets using accuracy, kappa coefficient, and ROC metrics, which demonstrate that our model is a good fit for the VOC and SKU-110K datasets.


Subject(s)
Volatile Organic Compounds; Benchmarking; Mental Recall; Semantics; Visual Perception
18.
BMC Med Imaging ; 23(1): 91, 2023 07 08.
Article in English | MEDLINE | ID: mdl-37422639

ABSTRACT

PURPOSE: Segmentation of liver vessels from CT images is indispensable prior to surgical planning and has attracted broad interest in the medical image analysis community. Due to the complex structure and low-contrast background, automatic liver vessel segmentation remains particularly challenging. Most related studies adopt FCN, U-net, and V-net variants as a backbone. However, these methods mainly focus on capturing multi-scale local features, which may produce misclassified voxels due to the convolutional operator's limited locality reception field. METHODS: We propose a robust end-to-end vessel segmentation network called Inductive BIased Multi-Head Attention Vessel Net (IBIMHAV-Net) by expanding the Swin transformer to 3D and employing an effective combination of convolution and self-attention. In practice, we introduce voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels and adopt multi-scale convolutional operators to gain local spatial information. We also propose inductive biased multi-head self-attention, which learns inductively biased relative positional embeddings from initialized absolute position embeddings, yielding more reliable query and key matrices. RESULTS: We conducted experiments on the 3DIRCADb dataset. The average dice and sensitivity of the four tested cases were 74.8% and 77.5%, which exceed the results of existing deep learning methods and an improved graph-cuts method. The Branches Detected (BD) and Tree-length Detected (TD) indexes also indicated better global/local feature capture than other methods. CONCLUSION: The proposed model, IBIMHAV-Net, provides automatic, accurate 3D liver vessel segmentation with an interleaved architecture that better utilizes both global and local spatial features in CT volumes. It can be further extended to other clinical data.


Subject(s)
Head , Liver , Humans , Liver/diagnostic imaging , Attention , Image Processing, Computer-Assisted/methods
19.
J King Saud Univ Comput Inf Sci ; 35(7): 101596, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37275558

ABSTRACT

COVID-19 is a contagious disease that affects the human respiratory system. Infected individuals may develop serious illness, and complications can result in death. Detecting COVID-19 from medical images showing essentially identical thoracic anomalies is challenging because manual reading is time-consuming, laborious, and prone to human error. This study proposes an end-to-end deep learning framework based on deep feature concatenation and a multi-head self-attention network. Feature concatenation fuses the outputs of fine-tuned DenseNet, VGG-16, and InceptionV3 backbones pretrained on the large-scale ImageNet dataset, while the multi-head self-attention network is adopted for a further performance gain. End-to-end training and evaluation were conducted on the COVID-19_Radiography_Dataset for binary and multi-class scenarios. The proposed model achieved overall accuracies of 96.33% and 98.67% and F1-scores of 92.68% and 98.67% for the multi-class and binary scenarios, respectively. The study also highlights the difference between feature concatenation and the highest individual model performance: 98.0% vs. 96.33% accuracy and 97.34% vs. 95.10% F1-score. Furthermore, saliency maps of the employed attention mechanism, which focus on the abnormal regions, are visualized using explainable artificial intelligence (XAI) techniques. The proposed framework yielded better COVID-19 prediction results, outperforming other recent deep learning models on the same dataset.
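The fusion step this abstract describes, concatenating feature vectors from several pretrained backbones before a classifier head, reduces to a simple operation. The sketch below stands in for the fine-tuned DenseNet/VGG-16/InceptionV3 outputs with arbitrary feature arrays; the names `fuse_and_classify`, `w_cls`, and `b_cls` are illustrative, and a real pipeline would use the actual backbone embeddings and a trained head.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_and_classify(backbone_feats, w_cls, b_cls):
    """Concatenate per-backbone feature vectors along the last axis into
    one fused descriptor, then apply a linear classifier with softmax."""
    fused = np.concatenate(backbone_feats, axis=-1)  # (batch, sum of dims)
    logits = fused @ w_cls + b_cls                   # (batch, num_classes)
    return softmax(logits, axis=-1)                  # class probabilities
```

The design rationale is that different backbones capture complementary features; concatenation preserves all of them and lets the downstream classifier (or, in the paper, a multi-head self-attention block) weigh them jointly.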

20.
PeerJ Comput Sci ; 9: e1246, 2023.
Article in English | MEDLINE | ID: mdl-37346669

ABSTRACT

To extract finer-grained segment features from news and represent users accurately and exhaustively, this article develops a news recommendation (NR) model based on a sub-attention news encoder. First, using a convolutional neural network (CNN) and a sub-attention mechanism, the model extracts a rich feature matrix from the news text. Then, fine-grained features are retrieved from the position and channel perspectives. Next, a multi-head self-attention mechanism is applied to the user's news browsing history, and time-series prediction captures the user's interests. Finally, experimental results show that the proposed model performs well on mean reciprocal rank (MRR), normalized discounted cumulative gain (NDCG), and area under the curve (AUC), with average improvements of 4.18%, 5.63%, and 6.55%, respectively. The comparative results demonstrate that the model performs best across a variety of datasets and converges fastest in all cases. The proposed model may provide guidance for the design of future news recommendation systems.
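The user-modeling step described above, attending over the embeddings of a user's browsed articles to form a single user vector, can be sketched with a simple additive attention pooling. This is a generic illustration under assumed names (`user_vector`, `w_proj`, `query`), not the paper's exact sub-attention encoder, which also uses CNN text features and multi-head attention.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def user_vector(history, w_proj, query):
    """Additive attention pooling over browsed-news embeddings.
    history: (num_articles, d) matrix of article vectors.
    Returns the weighted-sum user vector and the attention weights."""
    scores = np.tanh(history @ w_proj) @ query  # one score per article
    alpha = softmax(scores)                     # weights sum to 1
    return alpha @ history, alpha
```

Articles that score higher under the query contribute more to the user representation, which is then matched against candidate news vectors for ranking.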
