Results 1 - 20 of 18,176
1.
Sensors (Basel) ; 24(17)2024 Aug 24.
Article in English | MEDLINE | ID: mdl-39275411

ABSTRACT

Gait recognition based on gait silhouette profiles is currently a major approach in the field of gait recognition. In previous studies, models typically used gait silhouette images sized at 64 × 64 pixels as input data. However, in practical applications, silhouette images may be smaller than 64 × 64, leading to a loss of detail that significantly affects model accuracy. To address these challenges, we propose a gait recognition system named Multi-scale Feature Cross-Fusion Gait (MFCF-Gait). At the input stage of the model, we employ super-resolution algorithms to preprocess the data. During this process, we observed that the choice of super-resolution algorithm applied to larger silhouette images also affects training outcomes; improved super-resolution algorithms contribute to enhanced model performance. In terms of model architecture, we introduce a multi-scale feature cross-fusion network. By integrating low-level feature information from higher-resolution images with high-level feature information from lower-resolution images, the model emphasizes smaller-scale details, thereby improving recognition accuracy for smaller silhouette images. Experimental results on the CASIA-B dataset demonstrate significant improvements. On 64 × 64 silhouette images, the accuracies for the NM, BG, and CL conditions reached 96.49%, 91.42%, and 78.24%, respectively; on 32 × 32 silhouette images, the accuracies were 94.23%, 87.68%, and 71.57%, respectively, showing notable enhancements.
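A minimal sketch of the input-stage idea, assuming OpenCV bicubic interpolation as a stand-in for the paper's unnamed super-resolution algorithm; the function name, threshold, and sizes are illustrative:

import cv2
import numpy as np

def preprocess_silhouette(img: np.ndarray, target: int = 64) -> np.ndarray:
    # Upscale a small grayscale silhouette (e.g., 32 x 32) to target x target.
    if img.shape[0] < target:
        img = cv2.resize(img, (target, target), interpolation=cv2.INTER_CUBIC)
    # Re-binarize: interpolation introduces gray values along the contour.
    _, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
    return img

small = (np.random.rand(32, 32) > 0.5).astype(np.uint8) * 255
print(preprocess_silhouette(small).shape)  # (64, 64)

A learned super-resolution model would replace the cv2.resize call; the rest of the pipeline stays unchanged.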


Subject(s)
Algorithms , Gait , Gait/physiology , Humans , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods
2.
Sensors (Basel) ; 24(17)2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39275615

ABSTRACT

Speech emotion recognition is key to many fields, including human-computer interaction, healthcare, and intelligent assistance. While acoustic features extracted from human speech are essential for this task, not all of them contribute to emotion recognition effectively; thus, successful emotion recognition models require a reduced feature set. This work investigated whether splitting the features into two subsets based on their distribution and then applying commonly used feature reduction methods would impact accuracy. Filter reduction was employed using the Kruskal-Wallis test, followed by principal component analysis (PCA) and independent component analysis (ICA). A set of features was investigated to determine whether the indiscriminate use of parametric feature reduction techniques affects the accuracy of emotion recognition. For this investigation, data from three databases (Berlin EmoDB, SAVEE, and RAVDESS) were organized into subsets according to their distribution before applying PCA and ICA. The results showed a reduction from 6373 features to 170 for the Berlin EmoDB database with an accuracy of 84.3%; a final size of 130 features for SAVEE, with a corresponding accuracy of 75.4%; and 150 features for RAVDESS, with an accuracy of 59.9%.
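A hedged sketch of this kind of two-stage reduction (Kruskal-Wallis filter, then PCA) with SciPy and scikit-learn; the feature matrix, class count, 0.05 threshold, and component count are illustrative, not values from the paper:

import numpy as np
from scipy.stats import kruskal
from sklearn.decomposition import PCA  # FastICA is the analogous ICA step

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))   # 200 utterances x 500 acoustic features
y = rng.integers(0, 4, size=200)  # 4 emotion classes

# Filter step: keep features whose Kruskal-Wallis p-value is below 0.05.
keep = [j for j in range(X.shape[1])
        if kruskal(*(X[y == c, j] for c in np.unique(y))).pvalue < 0.05]

# Reduction step: project the surviving features with PCA.
X_reduced = PCA(n_components=min(50, len(keep))).fit_transform(X[:, keep])
print(len(keep), X_reduced.shape)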


Subject(s)
Emotions , Principal Component Analysis , Speech , Humans , Emotions/physiology , Speech/physiology , Databases, Factual , Algorithms , Pattern Recognition, Automated/methods
3.
Sensors (Basel) ; 24(17)2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39275635

ABSTRACT

In this paper, we study facial expression recognition (FER) using three modalities obtained from a light field camera: sub-aperture (SA), depth map, and all-in-focus (AiF) images. Our objective is to construct a more comprehensive and effective FER system by investigating multimodal fusion strategies. For this purpose, we employ EfficientNetV2-S, pre-trained on AffectNet, as our primary convolutional neural network. This model, combined with a BiGRU, is used to process SA images. We evaluate various fusion techniques at both decision and feature levels to assess their effectiveness in enhancing FER accuracy. Our findings show that the model using SA images surpasses state-of-the-art performance, achieving 88.13% ± 7.42% accuracy under the subject-specific evaluation protocol and 91.88% ± 3.25% under the subject-independent evaluation protocol. These results highlight our model's potential in enhancing FER accuracy and robustness, outperforming existing methods. Furthermore, our multimodal fusion approach, integrating SA, AiF, and depth images, demonstrates substantial improvements over unimodal models. The decision-level fusion strategy, particularly using average weights, proved most effective, achieving 90.13% ± 4.95% accuracy under the subject-specific evaluation protocol and 93.33% ± 4.92% under the subject-independent evaluation protocol. This approach leverages the complementary strengths of each modality, resulting in a more comprehensive and accurate FER system.
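A minimal sketch of decision-level fusion with average weights, where per-modality softmax scores are averaged class-wise before the argmax; the probability arrays below are placeholders, not outputs of the paper's models:

import numpy as np

def fuse_decisions(*prob_maps: np.ndarray) -> np.ndarray:
    # Average per-modality class probabilities, then pick the top class.
    return np.mean(prob_maps, axis=0).argmax(axis=1)

sa    = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]])  # sub-aperture branch
aif   = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])  # all-in-focus branch
depth = np.array([[0.5, 0.3, 0.2], [0.3, 0.4, 0.3]])  # depth-map branch
print(fuse_decisions(sa, aif, depth))  # fused class index per sample

Unequal weights (e.g., trusting the SA branch more) would replace np.mean with a weighted average.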


Subject(s)
Facial Expression , Neural Networks, Computer , Humans , Image Processing, Computer-Assisted/methods , Automated Facial Recognition/methods , Algorithms , Pattern Recognition, Automated/methods
4.
Sensors (Basel) ; 24(17)2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39275707

ABSTRACT

Emotion recognition through speech is a technique employed in various scenarios of Human-Computer Interaction (HCI). Existing approaches have achieved significant results; however, limitations persist, most notably in the quantity and diversity of data required when deep learning techniques are used. The lack of a standard for feature selection leads to continuous development and experimentation, and choosing and designing an appropriate network architecture constitutes another challenge. This study addresses the challenge of recognizing emotions in the human voice using deep learning techniques, proposing a comprehensive approach that develops preprocessing and feature selection stages and constructs a dataset, called EmoDSc, by combining several available databases. The synergy between spectral features and spectrogram images is investigated. Independently, the weighted accuracy obtained using only spectral features was 89%, while using only spectrogram images it reached 90%. These results, although surpassing previous research, highlight the strengths and limitations of each representation operating in isolation. Based on this exploration, a neural network architecture composed of a CNN1D, a CNN2D, and an MLP that fuses spectral features and spectrogram images is proposed. The model, supported by the unified EmoDSc dataset, demonstrates a remarkable accuracy of 96%.
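A minimal PyTorch sketch of such a two-branch fusion, assuming a 1-D CNN over spectral feature vectors and a 2-D CNN over spectrograms whose embeddings are concatenated into an MLP head; all layer sizes and input shapes are illustrative assumptions, not the paper's architecture:

import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, n_classes: int = 7):
        super().__init__()
        self.cnn1d = nn.Sequential(  # spectral-feature branch
            nn.Conv1d(1, 16, 5), nn.ReLU(), nn.AdaptiveAvgPool1d(8), nn.Flatten())
        self.cnn2d = nn.Sequential(  # spectrogram-image branch
            nn.Conv2d(1, 16, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.mlp = nn.Sequential(    # fusion head over concatenated embeddings
            nn.Linear(16 * 8 + 16 * 16, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, spectral, spectrogram):
        z = torch.cat([self.cnn1d(spectral), self.cnn2d(spectrogram)], dim=1)
        return self.mlp(z)

logits = FusionNet()(torch.randn(4, 1, 128), torch.randn(4, 1, 64, 64))
print(logits.shape)  # torch.Size([4, 7])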


Subject(s)
Deep Learning , Emotions , Neural Networks, Computer , Humans , Emotions/physiology , Speech/physiology , Databases, Factual , Algorithms , Pattern Recognition, Automated/methods
5.
Biomed Phys Eng Express ; 10(6)2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39231462

ABSTRACT

Hand Movement Recognition (HMR) with sEMG is crucial for artificial hand prostheses. HMR performance largely depends on the feature information fed to the classifiers; however, sEMG often captures noise such as power line interference (PLI) and motion artifacts, which can yield redundant and insignificant features, degrade HMR performance, and increase computational complexity. This study addresses these issues by proposing a novel procedure for automatically removing PLI and motion artifacts from experimental sEMG signals, making it possible to extract better features and improve the categorization of various hand movements. Empirical mode decomposition and energy entropy thresholding are utilized to select relevant mode components for artifact removal. Time-domain features are then used to train classifiers (kNN, LDA, SVM) for hand movement categorization, achieving average accuracies of 92.36%, 93.63%, and 98.12%, respectively, across subjects. Additionally, muscle contraction efforts are classified into low, medium, and high categories using this technique. Validation is performed on data from ten subjects performing eight hand movement classes and three muscle contraction efforts with three surface electrode channels. Results indicate that the proposed preprocessing improves average accuracy by 9.55% with the SVM classifier while significantly reducing computational time.
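A hedged sketch of the mode-selection idea, assuming the PyEMD package (pip install EMD-signal); the toy signal and the simple energy-share criterion are illustrative stand-ins for the paper's energy entropy thresholding:

import numpy as np
from PyEMD import EMD

t = np.arange(0, 2, 1 / 1000)                        # 2 s at 1 kHz
semg = np.random.randn(t.size) * np.hanning(t.size)  # toy sEMG burst
semg += 0.5 * np.sin(2 * np.pi * 50 * t)             # 50 Hz PLI contamination

imfs = EMD().emd(semg)                    # intrinsic mode functions (IMFs)
energy = np.array([np.sum(m ** 2) for m in imfs])
share = energy / energy.sum()
clean = imfs[share > 0.05].sum(axis=0)    # rebuild from dominant modes only
print(imfs.shape, clean.shape)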


Subject(s)
Algorithms , Artifacts , Electromyography , Hand , Movement , Pattern Recognition, Automated , Signal Processing, Computer-Assisted , Humans , Electromyography/methods , Hand/physiology , Pattern Recognition, Automated/methods , Male , Muscle Contraction , Adult , Artificial Limbs , Female , Motion , Muscle, Skeletal/physiology
6.
PLoS One ; 19(8): e0305118, 2024.
Article in English | MEDLINE | ID: mdl-39208254

ABSTRACT

To address problems of image quality and morphological fidelity in extracting primary underglaze brown decorative patterns, this paper proposes an extraction method that couples single-scale gamma correction with gray sharpening. Single-scale gamma correction improves the contrast and brightness of the image through a nonlinear transformation but may lose image detail; gray sharpening enhances the high-frequency components and improves clarity but introduces noise. Coupling the two techniques compensates for their respective shortcomings. The experimental results show that the method improves the efficiency of primary underglaze brown pattern extraction by retaining image detail while reducing the influence of noise. Quantitatively, the F1-score, MIoU, recall, precision, and accuracy were 0.92745, 0.82253, 0.97942, 0.92458, and 0.92745, respectively.
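A minimal sketch of the coupled enhancement, assuming a standard power-law gamma correction and unsharp masking as the gray-sharpening step; the gamma value and sharpening amount are illustrative:

import cv2
import numpy as np

def gamma_correct(img: np.ndarray, gamma: float = 0.6) -> np.ndarray:
    # Nonlinear (power-law) remapping of intensities to lift contrast.
    norm = img.astype(np.float32) / 255.0
    return np.uint8(np.power(norm, gamma) * 255)

def gray_sharpen(img: np.ndarray, amount: float = 1.0) -> np.ndarray:
    # Unsharp masking: boost high frequencies by subtracting a blur.
    blur = cv2.GaussianBlur(img, (0, 0), sigmaX=3)
    return cv2.addWeighted(img, 1 + amount, blur, -amount, 0)

pattern = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
enhanced = gray_sharpen(gamma_correct(pattern))
print(enhanced.shape, enhanced.dtype)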


Subject(s)
Algorithms , Image Processing, Computer-Assisted/methods , Image Enhancement/methods , Pattern Recognition, Automated/methods
7.
J Neural Eng ; 21(5)2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39178906

ABSTRACT

Objective. The decline in the performance of electromyography (EMG)-based silent speech recognition is widely attributed to disparities in speech patterns, articulation habits, and individual physiology among speakers. Feature alignment, by learning a discriminative network that resolves domain offsets across speakers, is an effective way to address this problem. However, the prevailing adversarial networks rely on a branching discriminator specialized in domain discrimination, which contributes only indirectly to the categorical predictions of the classifier. Approach. To this end, we propose a simplified discrepancy-based adversarial network with a streamlined end-to-end structure for EMG-based cross-subject silent speech recognition. Highly aligned features across subjects are obtained by introducing a Nuclear-norm Wasserstein discrepancy metric at the back end of the classification network, which can be utilized for both classification and domain discrimination. Given the low-level and implicitly noisy nature of myoelectric signals, we devise a cascaded adaptive rectification network as the front-end feature extractor, adaptively reshaping the intermediate feature map with automatically learnable channel-wise thresholds. The resulting features effectively filter out domain-specific information between subjects while retaining domain-invariant features critical for cross-subject recognition. Main results. A series of sentence-level classification experiments with 100 Chinese sentences demonstrates the efficacy of our method, achieving an average accuracy of 89.46% on 40 new subjects when trained with data from 60 subjects. Notably, our method achieves a 10.07% improvement over the state-of-the-art model when tested on 10 new subjects with 20 subjects used for training, surpassing even its result with three times as many training subjects. Significance. Our study demonstrates improved classification performance of the proposed adversarial architecture on cross-subject myoelectric signals, providing a promising prospect for EMG-based speech-interaction applications.
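As an illustrative fragment only (not the paper's full Nuclear-norm Wasserstein discrepancy), the batch nuclear norm of a softmax prediction matrix, the quantity that nuclear-norm discrepancy losses are built on, can be computed in PyTorch; the batch and class sizes are placeholders:

import torch

logits = torch.randn(32, 100)                     # batch x 100 sentence classes
probs = torch.softmax(logits, dim=1)
bnn = torch.linalg.matrix_norm(probs, ord="nuc")  # sum of singular values
print(bnn)  # larger values indicate more diverse, discriminable predictions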


Subject(s)
Electromyography , Humans , Electromyography/methods , Male , Female , Neural Networks, Computer , Adult , Speech Recognition Software , Young Adult , Pattern Recognition, Automated/methods , Speech/physiology
8.
Article in English | MEDLINE | ID: mdl-39186426

ABSTRACT

Hand motor impairment seriously affects the daily life of the elderly. We developed an electromyography (EMG) exosuit system with bidirectional hand support for bilateral coordination assistance, based on a dynamic gesture recognition model using a graph convolutional network (GCN) and a long short-term memory network (LSTM). The system comprises a hardware subsystem and a software subsystem. The hardware subsystem includes an exosuit jacket, a backpack module, an EMG recognition module, and a bidirectional support glove. The software subsystem, built around the dynamic gesture recognition model, identifies dynamic and static gestures by extracting the spatio-temporal features of the patient's EMG signals and controls glove movement. An offline training experiment built the gesture recognition model for each subject and evaluated its feasibility; online control experiments verified the effectiveness of the exosuit system. The experimental results showed that the proposed model achieved a gesture recognition rate of 96.42% ± 3.26%, higher than three traditional recognition models. All subjects successfully completed two daily tasks within a short time, and the success rates of bilateral coordination assistance were 88.75% and 86.88%. The exosuit system can effectively help patients through its bidirectional hand support strategy for bilateral coordination assistance in daily tasks, and the proposed method can be applied to various limb assistance scenarios.


Subject(s)
Electromyography , Gestures , Hand , Humans , Hand/physiology , Male , Female , Exoskeleton Device , Adult , Algorithms , Neural Networks, Computer , Pattern Recognition, Automated/methods , Software , Activities of Daily Living , Young Adult , Feasibility Studies
9.
Sensors (Basel) ; 24(16)2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39204927

ABSTRACT

This study delves into decoding hand gestures using surface electromyography (EMG) signals collected via a precision Myo-armband sensor, leveraging machine learning algorithms. The research entails rigorous data preprocessing to extract features and labels from raw EMG data. Following partitioning into training and testing sets, four traditional machine learning models are scrutinized for their efficacy in classifying finger movements across seven distinct gestures. The analysis includes meticulous parameter optimization and five-fold cross-validation to evaluate model performance. Among the models assessed, the Random Forest emerges as the top performer, consistently delivering superior precision, recall, and F1-score values across gesture classes, with ROC-AUC scores surpassing 99%. These findings underscore the Random Forest model as the optimal classifier for our EMG dataset, promising significant advancements in healthcare rehabilitation engineering and enhancing human-computer interaction technologies.
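A hedged sketch of the evaluation protocol described above, assuming scikit-learn; the synthetic feature matrix, gesture count, and hyperparameters stand in for the Myo-armband dataset and the paper's tuned settings:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(700, 32))    # 700 EMG windows x 32 extracted features
y = rng.integers(0, 7, size=700)  # 7 gesture classes

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # five-fold cross-validation
print(scores.mean(), scores.std())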


Subject(s)
Algorithms , Electromyography , Gestures , Hand , Machine Learning , Humans , Electromyography/methods , Hand/physiology , Male , Female , Adult , Signal Processing, Computer-Assisted , Young Adult , Pattern Recognition, Automated/methods , Movement/physiology
10.
Sensors (Basel) ; 24(16)2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39204983

ABSTRACT

In cross-country skiing, ski poles play a crucial role in technique, propulsion, and overall performance, and their kinematic parameters provide valuable information about the skier's technique, which is of great significance for coaches and athletes seeking to improve performance. In this work, a new smart ski pole is proposed that combines a uniaxial load cell with an inertial measurement unit (IMU), aiming to provide comprehensive data measurement in a form that is simple and efficient to use as a training aid. The pole directly collects data related to skiing technique, such as pole force, pole angle, and inertial data, and its wireless-transmission design makes data acquisition convenient. In the experiment, characteristic data recorded by the poles during Double Poling by three skiers were extracted, and a sample t-test showed significant differences among the three skiers in pole force, pole angle, and poling time. Spearman correlation analysis of the data from the best-performing skier showed that pole force was significantly correlated with speed (r = 0.71) and with pole support angle (r = 0.76). In addition, this study combined the commonly used inertial sensor data with the load cell data as input to a ski technique recognition algorithm: the recognition accuracy for five cross-country skiing techniques (Diagonal Stride (DS), Double Poling (DP), Kick Double Poling (KDP), Two-stroke Glide (G2), and Five-stroke Glide (G5)) reached 99.5%, a marked improvement over similar recognition systems. The equipment is therefore expected to be a valuable training tool for coaches and athletes, helping them better understand and improve ski maneuver technique.
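An illustrative sketch of the correlation step with SciPy; the per-cycle force and speed arrays are placeholder values, not measurements from the study:

import numpy as np
from scipy.stats import spearmanr

pole_force = np.array([182, 195, 201, 188, 210, 199, 205, 191])  # N
speed      = np.array([5.1, 5.4, 5.6, 5.2, 5.9, 5.5, 5.8, 5.3])  # m/s

rho, p = spearmanr(pole_force, speed)  # rank correlation, as in the study
print(f"rho={rho:.2f}, p={p:.3f}")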


Subject(s)
Skiing , Skiing/physiology , Humans , Biomechanical Phenomena/physiology , Pattern Recognition, Automated/methods , Athletic Performance/physiology
11.
Sensors (Basel) ; 24(16)2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39205085

ABSTRACT

In recent years, significant progress has been made in facial expression recognition methods; however, facial expression recognition in real environments still requires further research. This paper proposes a tri-cross-attention transformer with a multi-feature fusion network (TriCAFFNet) to improve facial expression recognition performance under challenging conditions. By combining LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, landmark features, and CNN (convolutional neural network) features from facial images, the model is provided with a rich input that improves its ability to discern subtle differences between images. Additionally, tri-cross-attention blocks are designed to facilitate information exchange between the different features, enabling them to guide one another toward salient attention. Extensive experiments on several widely used datasets show that TriCAFFNet achieves state-of-the-art performance: 92.17% on RAF-DB, 67.40% on AffectNet (7 cls), and 63.49% on AffectNet (8 cls).
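A minimal sketch of two of the hand-crafted inputs named above, assuming scikit-image; the crop size, LBP parameters, and HOG cell geometry are illustrative choices:

import numpy as np
from skimage.feature import hog, local_binary_pattern

face = np.random.rand(96, 96)  # placeholder grayscale face crop

lbp = local_binary_pattern(face, P=8, R=1, method="uniform")
lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

hog_vec = hog(face, orientations=9, pixels_per_cell=(8, 8),
              cells_per_block=(2, 2))
print(lbp_hist.shape, hog_vec.shape)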


Subject(s)
Facial Expression , Neural Networks, Computer , Humans , Algorithms , Image Processing, Computer-Assisted/methods , Face/anatomy & histology , Automated Facial Recognition/methods , Pattern Recognition, Automated/methods
12.
Article in English | MEDLINE | ID: mdl-39172614

ABSTRACT

Surface electromyography (sEMG), a human-machine interface for gesture recognition, has shown promising potential for decoding motor intentions, but a variety of nonideal factors restrict its practical application in assistive robots. In this paper, we summarized the current mainstream gesture recognition strategies and proposed a gesture recognition method based on multimodal canonical correlation analysis feature fusion classification (MCAFC) for a nonideal condition that occurs in daily life, i.e., posture variations. The deep features of the sEMG and acceleration signals were first extracted via convolutional neural networks. A canonical correlation analysis was subsequently performed to associate the deep features of the two modalities. The transformed features were utilized as inputs to a linear discriminant analysis classifier to recognize the corresponding gestures. Both offline and real-time experiments were conducted on eight non-disabled subjects. The experimental results indicated that MCAFC achieved an average classification accuracy, average motion completion rate, and average motion completion time of 93.44%, 94.05%, and 1.38 s, respectively, with multiple dynamic postures, indicating significantly better performance than that of comparable methods. The results demonstrate the feasibility and superiority of the proposed multimodal signal feature fusion method for gesture recognition with posture variations, providing a new scheme for myoelectric control.
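A hedged sketch of the fusion step, substituting scikit-learn's linear CCA for the paper's deep-feature canonical correlation analysis; the feature dimensions, sample count, and gesture count are illustrative:

import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
emg_feats = rng.normal(size=(300, 64))  # deep sEMG features (placeholder)
acc_feats = rng.normal(size=(300, 32))  # deep acceleration features
y = rng.integers(0, 6, size=300)        # 6 gestures

cca = CCA(n_components=16).fit(emg_feats, acc_feats)
emg_c, acc_c = cca.transform(emg_feats, acc_feats)
fused = np.hstack([emg_c, acc_c])       # correlated-subspace features

lda = LinearDiscriminantAnalysis().fit(fused, y)
print(lda.score(fused, y))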


Subject(s)
Algorithms , Electromyography , Gestures , Hand , Neural Networks, Computer , Pattern Recognition, Automated , Posture , Humans , Posture/physiology , Hand/physiology , Male , Pattern Recognition, Automated/methods , Adult , Female , Young Adult , Discriminant Analysis , Deep Learning , Healthy Volunteers
13.
Neural Netw ; 179: 106573, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39096753

ABSTRACT

Recognizing expressions from dynamic facial videos can reveal more natural affective states of humans, but it becomes a more challenging task in real-world scenes due to facial pose variations, partial occlusions, and subtle dynamic changes in emotion sequences. Existing transformer-based methods often rely on self-attention to model the global relations among spatial or temporal features, which cannot adequately focus on the important expression-related locality structures in both the spatial and temporal features of in-the-wild expression videos. To this end, we incorporate diverse graph structures into transformers and propose CDGT, a method that constructs diverse graph transformers for efficient emotion recognition from in-the-wild videos. Specifically, our method contains a spatial dual-graph transformer and a temporal hyperbolic-graph transformer. The former deploys a dual-graph-constrained attention to capture latent emotion-related graph geometry structures among local spatial tokens for efficient feature representation, especially for video frames with pose variations and partial occlusions. The latter adopts a hyperbolic-graph-constrained self-attention that explores important temporal graph structure information in hyperbolic space to model the more subtle changes of dynamic emotion. Extensive experimental results on in-the-wild video-based facial expression databases show that the proposed CDGT outperforms other state-of-the-art methods.


Subject(s)
Emotions , Facial Expression , Video Recording , Humans , Emotions/physiology , Algorithms , Neural Networks, Computer , Facial Recognition/physiology , Pattern Recognition, Automated/methods , Automated Facial Recognition/methods
14.
Neural Netw ; 179: 106622, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39142175

ABSTRACT

Dark-video human action recognition has a wide range of real-world applications. General action recognition methods focus on the actor or the action itself while ignoring the dark scene in which the action happens, resulting in unsatisfactory recognition accuracy. For dark scenes, existing two-step action recognition methods add pipeline complexity by introducing additional augmentation steps, while the one-step pipeline method is not lightweight enough. To address these issues, a one-step Transformer-based method named Dark Domain Shift for Action Recognition (Dark-DSAR) is proposed in this paper. It integrates the tasks of domain migration and classification into a single step and enhances the model's functional coherence with respect to these two tasks, giving Dark-DSAR low computational cost but high accuracy. Specifically, the domain shift module (DSM) achieves domain adaptation from dark to bright while reducing the number of parameters and the computational cost. In addition, we explore the matching relationship between input video size and the model, which further improves inference efficiency by dropping spatial resolution to remove redundant information from the videos. Extensive experiments have been conducted on the ARID1.5, HMDB51-Dark, and UAV-human-night datasets. Results show that the proposed Dark-DSAR obtains the best Top-1 accuracy on ARID1.5 with 89.49%, which is 2.56% higher than the state-of-the-art method, and reaches 67.13% and 61.9% on HMDB51-Dark and UAV-human-night, respectively. Ablation experiments further reveal that action classifiers gain ≥1% in accuracy over the original model when equipped with our DSM.


Subject(s)
Pattern Recognition, Automated , Video Recording , Humans , Pattern Recognition, Automated/methods , Neural Networks, Computer , Algorithms , Darkness
15.
Med Image Anal ; 97: 103284, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39096843

ABSTRACT

The classic metaphyseal lesion (CML) is a unique fracture highly specific for infant abuse. This fracture is often subtle in radiographic appearance and commonly occurs in the distal tibia. The development of an automated model that can accurately identify distal tibial radiographs with CMLs is important to assist radiologists in detecting these fractures. However, building such a model typically requires a large and diverse training dataset. To address this problem, we propose a novel diffusion model for data augmentation called masked conditional diffusion model (MaC-DM). In contrast to previous generative models, our approach produces a wide range of realistic-appearing synthetic images of distal tibial radiographs along with their associated segmentation masks. MaC-DM achieves this by incorporating weighted segmentation masks of the distal tibias and CML fracture sites as image conditions for guidance. The augmented images produced by MaC-DM significantly enhance the performance of various commonly used classification models, accurately distinguishing normal distal tibial radiographs from those with CMLs. Additionally, it substantially improves the performance of different segmentation models, accurately labeling areas of the CMLs on distal tibial radiographs. Furthermore, MaC-DM can control the size of the CML fracture in the augmented images.


Subject(s)
Algorithms , Radiographic Image Interpretation, Computer-Assisted , Sensitivity and Specificity , Tibial Fractures , Humans , Tibial Fractures/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Reproducibility of Results , Radiographic Image Enhancement/methods , Infant , Pattern Recognition, Automated/methods , Child Abuse , Computer Simulation
16.
Math Biosci Eng ; 21(7): 6631-6657, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39176412

ABSTRACT

Facial emotion recognition (FER) is widely utilized to analyze human emotion for the needs of many real-time applications such as computer-human interfaces, emotion detection, forensics, biometrics, and human-robot collaboration. Nonetheless, existing methods are mostly unable to offer correct predictions with a low error rate. In this paper, an innovative facial emotion recognition framework, termed extended walrus-based deep learning with Botox feature selection network (EWDL-BFSN), was designed to accurately detect facial emotions. The main goals of the EWDL-BFSN are to identify facial emotions automatically and effectively by choosing the optimal features and adjusting the hyperparameters of the classifier. The gradient wavelet anisotropic filter (GWAF) is used for image pre-processing in the EWDL-BFSN model. Additionally, SqueezeNet is used to extract significant features, and the improved Botox optimization algorithm (IBoA) is then used to choose the best features. Lastly, FER and classification are accomplished through an enhanced optimization-based kernel residual 50 (EK-ResNet50) network, while a nature-inspired metaheuristic, the walrus optimization algorithm (WOA), is utilized to pick the hyperparameters of the EK-ResNet50 network. The EWDL-BFSN model was trained and tested with the publicly available CK+ and FER-2013 datasets. The Python platform was used for implementation, and performance metrics such as accuracy, sensitivity, specificity, and F1-score were analyzed against state-of-the-art methods. The proposed EWDL-BFSN model achieved overall accuracies of 99.37% and 99.25% on the CK+ and FER-2013 datasets, respectively, proving its superiority in predicting facial emotions over state-of-the-art methods.


Subject(s)
Algorithms , Deep Learning , Emotions , Facial Expression , Humans , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Databases, Factual , Pattern Recognition, Automated/methods , Face , Reproducibility of Results
17.
Brain Behav ; 14(8): e3519, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39169422

ABSTRACT

BACKGROUND: Neurological disorders pose a significant health challenge, and their early detection is critical for effective treatment planning and prognosis. Traditional classification of neural disorders based on causes, symptoms, developmental stage, severity, and nervous system effects has limitations. Leveraging artificial intelligence (AI) and machine learning (ML) for pattern recognition provides a potent solution to these challenges. This study therefore proposes an innovative approach, the Aggregated Pattern Classification Method (APCM), for precise identification of neural disorder stages. METHOD: The APCM was introduced to address prevalent issues in neural disorder detection, such as overfitting, robustness, and interoperability. The method utilizes aggregative patterns and classification learning functions to mitigate these challenges and enhance overall recognition accuracy, even with imbalanced data. The analysis involves neural images, using observations from healthy individuals as a reference. Action response patterns from diverse inputs are mapped to identify similar features, establishing the disorder ratio. The stages are correlated based on available responses and associated neural data, with a preference for classification learning. This classification requires image and labeled data to prevent additional flaws in pattern recognition. Recognition and classification occur through multiple iterations, incorporating similar and diverse neural features, and the learning process is finely tuned for fine-grained classification using labeled and unlabeled input data. RESULTS: The proposed APCM demonstrates notable achievements, with a 15.03% gain in pattern recognition and 10.61% fewer classification errors (CEs). The method effectively addresses overfitting, robustness, and interoperability issues, showcasing its potential as a powerful tool for detecting neural disorders at different stages; its ability to handle imbalanced data contributes to the overall success of the algorithm. CONCLUSION: The APCM emerges as a promising and effective approach for identifying precise neural disorder stages. By leveraging AI and ML, the method successfully resolves key challenges in pattern recognition, and the improved recognition with reduced CEs underscores its potential for clinical applications. However, the approach relies on high-quality neural image data, which may limit its generalizability. Future research can refine the method further and enhance its interpretability, providing valuable insights into neural disorder progression and underlying biological mechanisms.


Subject(s)
Machine Learning , Humans , Nervous System Diseases/classification , Nervous System Diseases/diagnosis , Pattern Recognition, Automated/methods , Artificial Intelligence
18.
Sensors (Basel) ; 24(15)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39123885

ABSTRACT

Pattern recognition (PR)-based myoelectric control systems can naturally provide multifunctional and intuitive control of upper limb prostheses and restore lost limb function, but understanding their robustness remains an open scientific question. This study investigates how limb positions and electrode shifts, two factors suggested to cause classification deterioration, affect classifier performance, by quantifying changes in the class distribution using each factor as a class and computing the repeatability and modified separability indices. Ten intact-limb participants took part in the study, with linear discriminant analysis (LDA) as the classifier. The results confirmed previous findings that limb positions and electrode shifts deteriorate classification performance (a 14-21% decrease), with no difference between the two factors (p > 0.05). When considering limb positions and electrode shifts as classes, we could classify them with accuracies of 96.13 ± 1.44% and 65.40 ± 8.23% for single and all motions, respectively. Testing on five amputees corroborated these findings. We have demonstrated that each factor introduces changes in the feature space that constitute statistically new class instances: the feature space contains two statistically classifiable clusters when the same motion is collected in two different limb positions or electrode shifts. Our results are a step forward in understanding the challenges PR schemes face in myoelectric control of prostheses, and further validation needs to be conducted on more amputee-related datasets.
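A hedged sketch of the diagnostic idea, treating the limb position (not the motion) as the class label and checking whether LDA can separate the same motion recorded in two positions; the synthetic features and offset stand in for real EMG features:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
pos_a = rng.normal(loc=0.0, size=(150, 24))  # motion X, limb position A
pos_b = rng.normal(loc=0.4, size=(150, 24))  # same motion, limb position B
X = np.vstack([pos_a, pos_b])
y = np.array([0] * 150 + [1] * 150)          # label = limb position

acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
print(f"position classifiable with accuracy {acc:.2f}")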


Subject(s)
Amputees , Artificial Limbs , Electrodes , Electromyography , Pattern Recognition, Automated , Humans , Electromyography/methods , Male , Adult , Pattern Recognition, Automated/methods , Amputees/rehabilitation , Female , Discriminant Analysis , Young Adult , Extremities/physiology
19.
Sensors (Basel) ; 24(15)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39123896

ABSTRACT

For successful human-robot collaboration, it is crucial to establish and sustain quality interaction between humans and robots, making it essential to facilitate human-robot interaction (HRI) effectively. The evolution of robot intelligence now enables robots to take a proactive role in initiating and sustaining HRI, thereby allowing humans to concentrate more on their primary tasks. In this paper, we introduce a system known as the Robot-Facilitated Interaction System (RFIS), where mobile robots are employed to perform identification, tracking, re-identification, and gesture recognition in an integrated framework to ensure anytime readiness for HRI. We implemented the RFIS on an autonomous mobile robot used for transporting a patient, to demonstrate proactive, real-time, and user-friendly interaction with a caretaker involved in monitoring and nursing the patient. In the implementation, we focused on the efficient and robust integration of various interaction facilitation modules within a real-time HRI system that operates in an edge computing environment. Experimental results show that the RFIS, as a comprehensive system integrating caretaker recognition, tracking, re-identification, and gesture recognition, can provide an overall high quality of interaction in HRI facilitation with average accuracies exceeding 90% during real-time operations at 5 FPS.


Subject(s)
Gestures , Robotics , Robotics/methods , Humans , Pattern Recognition, Automated/methods , Algorithms , Artificial Intelligence
20.
Sensors (Basel) ; 24(15)2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39123907

ABSTRACT

Skeleton-based action recognition, renowned for its computational efficiency and robustness to lighting variations, has become a focal point in motion analysis. However, most current methods extract only global skeleton features, overlooking the potential semantic relationships among partial limb motions. For instance, the subtle differences between actions such as "brush teeth" and "brush hair" are mainly distinguished by fine-grained local cues. Although combining limb movements provides a more holistic representation of an action, relying solely on skeleton points proves inadequate for capturing these nuances, so integrating detailed linguistic descriptions into the learning of skeleton features is essential. This motivates us to explore fine-grained language descriptions for capturing more discriminative skeleton behavior representations, and to this end we introduce a new Linguistic-Driven Partial Semantic Relevance Learning framework (LPSR). While using state-of-the-art large language models to generate linguistic descriptions of local limb motions and further constrain the learning of local motions, we also aggregate global skeleton-point representations and textual representations (generated by an LLM) to obtain a more generalized cross-modal behavioral representation. On this basis, we propose a cyclic attentional interaction module to model the implicit correlations between partial limb motions. Numerous ablation experiments demonstrate the effectiveness of the proposed method, which also obtains state-of-the-art results.


Subject(s)
Semantics , Humans , Linguistics , Movement/physiology , Pattern Recognition, Automated/methods , Algorithms , Learning/physiology