Results 1 - 14 of 14
1.
Sci Rep ; 14(1): 15310, 2024 07 03.
Article in English | MEDLINE | ID: mdl-38961136

ABSTRACT

Human activity recognition has a wide range of applications in fields such as video surveillance, virtual reality, and intelligent human-computer interaction, and has emerged as a significant research area in computer vision. Graph Convolutional Networks (GCNs) have recently been widely used in these fields and have achieved strong performance. However, challenges remain, including the over-smoothing problem caused by stacked graph convolutions and insufficient semantic correlation for capturing large movements across time sequences. The Vision Transformer (ViT) has been applied in many 2D and 3D imaging fields with impressive results. In our work, we propose a novel human activity recognition method based on ViT (HAR-ViT). We integrate the enhanced AGCL (eAGCL) from 2s-AGCN into ViT so that it can process spatio-temporal data (3D skeletons) and make full use of spatial features. The position encoder module orders the otherwise unordered information, while the transformer encoder efficiently compresses sequence features to enhance calculation speed. Human activity recognition is accomplished through a multi-layer perceptron (MLP) classifier. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on three widely used datasets: NTU RGB+D 60, NTU RGB+D 120 and Kinetics-Skeleton 400.
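The abstract does not include code; as a rough illustration of the position-encoder idea, a standard sinusoidal encoding (the function name and layout below are illustrative, not taken from HAR-ViT) could look like:

```python
import numpy as np

def positional_encoding(num_frames: int, dim: int) -> np.ndarray:
    """Sinusoidal positional encoding for a sequence of skeleton frames,
    so a permutation-invariant transformer encoder can recover frame order."""
    pos = np.arange(num_frames)[:, None]                    # (T, 1)
    i = np.arange(dim)[None, :]                             # (1, D)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)   # (T, D)
    pe = np.zeros((num_frames, dim))
    pe[:, 0::2] = np.sin(angle[:, 0::2])                    # even channels: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])                    # odd channels: cosine
    return pe
```

Adding this matrix to per-frame skeleton embeddings gives the attention layers access to temporal order.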


Subject(s)
Human Activities , Humans , Neural Networks, Computer , Algorithms , Pattern Recognition, Automated/methods , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods
2.
Comput Biol Med ; 173: 108382, 2024 May.
Article in English | MEDLINE | ID: mdl-38574530

ABSTRACT

Research evidence shows that physical rehabilitation exercises prescribed by medical experts can assist in restoring physical function, improving quality of life, and promoting independence for physically disabled individuals. In response to the absence of immediate expert feedback on performed actions, developing a Human Action Evaluation (HAE) system emerges as a valuable automated solution, addressing the need for accurate assessment of exercises and guidance during physical rehabilitation. Previous HAE systems developed for rehabilitation exercises have focused on models that take skeleton data as input and compute a quality score for each action performed by the patient. However, existing studies have concentrated on improving scoring performance while often overlooking computational efficiency. In this research, we propose the LightPRA (Light Physical Rehabilitation Assessment) system, an innovative architecture based on a Temporal Convolutional Network (TCN) that harnesses the capabilities of dilated causal Convolutional Neural Networks (CNNs). This approach efficiently captures complex temporal features and characteristics of the skeleton data with lower computational complexity, making it suitable for real-time feedback on resource-constrained devices such as Internet of Things (IoT) devices and Edge computing frameworks. Through empirical analysis on the University of Idaho-Physical Rehabilitation Movement Data (UI-PRMD) and KInematic assessment of MOvement for remote monitoring of physical REhabilitation (KIMORE) datasets, the proposed LightPRA model demonstrates superior performance over several state-of-the-art approaches, such as the Spatial-Temporal Graph Convolutional Network (STGCN) and Long Short-Term Memory (LSTM)-based models, in scoring human activity performance, while exhibiting lower computational cost and complexity.
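The efficiency ingredient named above is the dilated causal convolution. A minimal single-channel sketch (not the authors' implementation) shows why the output at time t never sees future frames:

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation=1):
    """Dilated causal 1D convolution: y[t] depends only on
    x[t], x[t-d], x[t-2d], ... (left zero padding, no look-ahead).
    x: (T,) signal; w: (K,) kernel, w[0] applied to the newest sample."""
    T, K = len(x), len(w)
    pad = dilation * (K - 1)
    xp = np.concatenate([np.zeros(pad), x])  # causal (left-only) padding
    y = np.zeros(T)
    for t in range(T):
        for k in range(K):
            y[t] += w[k] * xp[pad + t - k * dilation]
    return y
```

Stacking such layers with growing dilation gives a large temporal receptive field at low cost, which is the property LightPRA exploits.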


Subject(s)
Exercise Therapy , Medicine , Humans , Exercise , Movement , Neural Networks, Computer , Radiopharmaceuticals
3.
Biomimetics (Basel) ; 9(3)2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38534808

ABSTRACT

Skeleton-based human interaction recognition is a challenging task in the field of vision and image processing. Graph Convolutional Networks (GCNs) have achieved remarkable performance by modeling the human skeleton as a topology. However, existing GCN-based methods have two problems: (1) existing frameworks cannot effectively take advantage of the complementary features of different skeletal modalities, since there is no information transfer channel between specific modalities; and (2) limited by the structure of the skeleton topology, it is hard to capture and learn information about two-person interactions. To solve these problems, inspired by the human visual neural network, we propose a multi-modal enhancement transformer (ME-Former) network for skeleton-based human interaction recognition. ME-Former includes a multi-modal enhancement module (ME) and a context progressive fusion block (CPF). More specifically, each ME module consists of a multi-head cross-modal attention block (MH-CA) and a two-person hypergraph self-attention block (TH-SA), which are responsible, respectively, for enhancing the skeleton features of a specific modality using other skeletal modalities and for modeling spatial dependencies between joints within that modality. In addition, we propose a two-person skeleton topology and a two-person hypergraph representation; the TH-SA block embeds their structural information into the self-attention to better learn two-person interaction. The CPF block progressively transforms the features of different skeletal modalities from low-level features to higher-order global contexts, making the enhancement process more efficient. Extensive experiments on the benchmark NTU-RGB+D 60 and NTU-RGB+D 120 datasets consistently verify the effectiveness of the proposed ME-Former, which outperforms state-of-the-art methods.
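As a hedged illustration of the cross-modal attention idea behind MH-CA (a single head with no learned projections; the function names are ours, not the paper's):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(q_feats, kv_feats):
    """Queries come from one skeletal modality (e.g. joint features),
    keys/values from another (e.g. bone features), so information
    flows across modalities. q_feats: (Nq, D), kv_feats: (Nk, D)."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)  # (Nq, Nk)
    attn = softmax(scores, axis=-1)             # rows sum to 1
    return attn @ kv_feats, attn
```

Each query position ends up as a convex combination of the other modality's features, which is the "transfer channel" the abstract says plain GCNs lack.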

4.
Sensors (Basel) ; 24(6)2024 Mar 16.
Article in English | MEDLINE | ID: mdl-38544172

ABSTRACT

Physical exercise affects many facets of life, including mental health, social interaction, physical fitness, and illness prevention. Several AI-driven techniques have therefore been developed in the literature to recognize human physical activities. However, these techniques fail to adequately learn the temporal and spatial features of the data patterns, and they cannot fully comprehend complex activity patterns over different periods, emphasizing the need for enhanced architectures that increase accuracy by learning the spatial and temporal dependencies in the data individually. In this work, we develop an attention-enhanced dual-stream network (PAR-Net) for physical activity recognition that extracts spatial and temporal features simultaneously. PAR-Net integrates convolutional neural networks (CNNs) and echo state networks (ESNs), followed by a self-attention mechanism for optimal feature selection. The dual-stream feature extraction mechanism enables PAR-Net to learn spatiotemporal dependencies from real data. Furthermore, the self-attention mechanism contributes substantially by focusing attention on significant features, enhancing the identification of nuanced activity patterns. PAR-Net was evaluated on two benchmark physical activity recognition datasets and outperformed comparative baselines. Additionally, a thorough ablation study was conducted to determine the optimal model configuration for human physical activity recognition.
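The less familiar half of this dual stream is the echo state network. A minimal leaky-reservoir update (a generic ESN sketch under our own naming, not PAR-Net's exact temporal stream) looks like:

```python
import numpy as np

def esn_states(inputs, W_in, W_res, leak=0.3):
    """Run an (untrained, randomly weighted) ESN reservoir over a sequence;
    in an ESN only a linear readout on these states would be trained.
    inputs: (T, D); W_in: (H, D); W_res: (H, H). Returns (T, H) states."""
    H = W_res.shape[0]
    x = np.zeros(H)
    states = []
    for u in inputs:
        # leaky integration of the nonlinear reservoir response
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W_res @ x)
        states.append(x.copy())
    return np.array(states)
```

Because the recurrent weights stay fixed, the temporal stream is cheap to train, which is one common motivation for pairing ESNs with CNNs.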


Subject(s)
Machine Learning , Neural Networks, Computer , Humans , Human Activities , Recognition, Psychology , Exercise
5.
Sensors (Basel) ; 24(6)2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38544204

ABSTRACT

The advancement of deep learning in human activity recognition (HAR) using 3D skeleton data is critical for applications in healthcare, security, sports, and human-computer interaction. This paper tackles a well-known gap in the field: the lack of testing of the applicability and reliability of XAI evaluation metrics in the skeleton-based HAR domain. To address this problem, we tested established XAI metrics, namely faithfulness and stability, on Class Activation Mapping (CAM) and Gradient-weighted Class Activation Mapping (Grad-CAM). This study introduces a perturbation method that produces variations within the error tolerance of motion sensor tracking, ensuring the resulting skeletal data points remain within the plausible output range of human movement as captured by the tracking device. We used the NTU RGB+D 60 dataset and the EfficientGCN architecture for HAR model training and testing. The evaluation involved systematically perturbing the 3D skeleton data by applying controlled displacements at different magnitudes to assess the impact on XAI metric performance across multiple action classes. Our findings reveal that faithfulness may not consistently serve as a reliable metric across all classes for the EfficientGCN model, indicating its limited applicability in certain contexts. In contrast, stability proves to be a more robust metric, showing dependability across different perturbation magnitudes. Additionally, CAM and Grad-CAM yielded almost identical explanations, leading to closely similar metric outcomes. This suggests a need to explore additional metrics and apply more diverse XAI methods to broaden the understanding and effectiveness of XAI in skeleton-based HAR.
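The abstract does not spell out the exact perturbation scheme; one simple variant consistent with "controlled displacements at different magnitudes" displaces every joint by a random direction of fixed length (the function and its parameters are our illustrative assumption):

```python
import numpy as np

def perturb_skeleton(joints, magnitude, rng=None):
    """Displace every 3D joint by a random unit direction scaled to exactly
    `magnitude`, emulating bounded sensor tracking error.
    joints: (T, J, 3) array of frame-by-joint coordinates."""
    rng = rng or np.random.default_rng(0)
    noise = rng.normal(size=joints.shape)
    norms = np.linalg.norm(noise, axis=-1, keepdims=True)  # per-joint lengths
    return joints + magnitude * noise / norms
```

Sweeping `magnitude` over a range and re-scoring the explanations is then enough to reproduce the kind of metric-vs-perturbation analysis described.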


Subject(s)
Musculoskeletal System , Humans , Reproducibility of Results , Movement , Skeleton , Human Activities
6.
Sensors (Basel) ; 23(23)2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38067723

ABSTRACT

The global concern regarding the monitoring of construction workers' activities necessitates an efficient means of continuous monitoring for timely action recognition at construction sites. This paper introduces a novel approach, the multi-scale graph strategy, to enhance feature extraction in complex networks. At the core of this strategy lies the multi-feature fusion network (MF-Net), which employs multiple scale graphs in distinct network streams to capture both local and global features of crucial joints. This approach extends beyond local relationships to encompass broader connections, such as those between the head and foot, as well as interactions between the head and neck. By integrating diverse scale graphs into distinct network streams, we effectively incorporate physically unrelated information, aiding the extraction of vital local joint contour features. Furthermore, we introduce velocity and acceleration as temporal features and fuse them with the spatial features to enhance informational efficacy and model performance. Finally, efficiency-enhancing measures, such as a bottleneck structure and a branch-wise attention block, are implemented to optimize computational resources while enhancing feature discriminability. The significance of this paper lies in improving the management practices of the construction industry, ultimately aiming to enhance workers' health and work efficiency.


Subject(s)
Construction Industry , Musculoskeletal System , Humans , Skeleton , Foot , Lower Extremity
7.
Front Bioeng Biotechnol ; 11: 1191868, 2023.
Article in English | MEDLINE | ID: mdl-37409167

ABSTRACT

Introduction: Balance impairment is an important indicator of a variety of diseases. Early detection of balance impairment enables doctors to provide timely treatment to patients, reducing their fall risk and preventing related disease progression. Currently, balance abilities are usually assessed with balance scales, which depend heavily on the subjective judgement of assessors. Methods: To address this issue, we designed a method combining 3D skeleton data and a deep convolutional neural network (DCNN) for automated assessment of balance abilities during walking. A 3D skeleton dataset with three standardized balance ability levels was collected and used to establish the proposed method. To obtain better performance, different skeleton-node selections and different DCNN hyperparameter settings were compared. Leave-one-subject-out cross-validation was used for training and validation of the networks. Results and Discussion: Results showed that the proposed deep learning method achieved 93.33% accuracy, 94.44% precision and a 94.46% F1 score, outperforming four other commonly used machine learning and CNN-based methods. We also found that data from the trunk and lower limbs are the most important, while data from the upper limbs may reduce model accuracy. To further validate the performance of the proposed method, we migrated a state-of-the-art posture classification method to the walking balance ability assessment task; results showed that the proposed DCNN model improved the accuracy of walking balance ability assessment. Layer-wise Relevance Propagation (LRP) was used to interpret the output of the proposed DCNN model. Our results suggest that the DCNN classifier is a fast and accurate method for balance assessment during walking.
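The leave-one-subject-out protocol used here can be sketched in a few lines (a generic implementation, not the authors' code):

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_idx, test_idx) pairs, holding out all samples of one
    subject per fold -- the standard scheme for small clinical datasets,
    ensuring no subject appears in both train and test sets."""
    subjects = sorted(set(subject_ids))
    for held_out in subjects:
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield train, test
```

The number of folds equals the number of subjects, and each reported metric is averaged across folds.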

8.
Sensors (Basel) ; 23(14)2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37514691

ABSTRACT

Graph convolutional networks (GCNs), which extend convolutional neural networks (CNNs) to non-Euclidean structures, have been used to advance skeleton-based human action recognition research and have made substantial progress in doing so. However, some challenges remain in constructing recognition models based on GCNs. In this paper, we propose an enhanced adjacency matrix-based graph convolutional network with a combinatorial attention mechanism (CA-EAMGCN) for skeleton-based action recognition. Firstly, an enhanced adjacency matrix is constructed to expand the model's receptive field over global node features. Secondly, a feature selection fusion module (FSFM) is designed to provide an optimal fusion ratio for the model's multiple input features. Finally, a combinatorial attention mechanism is devised: our spatial-temporal (ST) attention module and limb attention module (LAM) are integrated into a multi-input branch and the mainstream network of the proposed model, respectively. Extensive experiments on three large-scale datasets, namely the NTU RGB+D 60, NTU RGB+D 120 and UAV-Human datasets, show that the proposed model satisfies both the light-weight and recognition-accuracy requirements. This demonstrates the effectiveness of our method.
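The core spatial operation behind such models, aggregation over a (possibly enhanced) adjacency matrix followed by a channel-mixing projection, can be sketched as a generic layer (not CA-EAMGCN itself):

```python
import numpy as np

def graph_conv(X, A, W):
    """One spatial graph-convolution layer over a skeleton graph:
    a row-normalized adjacency averages each joint's neighbourhood,
    then a linear map mixes channels, followed by ReLU.
    X: (J, C) joint features; A: (J, J) adjacency with self-loops; W: (C, C')."""
    D = A.sum(axis=1)
    A_hat = A / D[:, None]                 # row-normalize: rows sum to 1
    return np.maximum(A_hat @ X @ W, 0.0)  # aggregate, project, ReLU
```

Enlarging or learning extra entries of `A` (the "enhanced adjacency matrix" of the abstract) is what lets distant joints exchange information in one layer.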

9.
Comput Biol Med ; 158: 106835, 2023 05.
Article in English | MEDLINE | ID: mdl-37019012

ABSTRACT

Performing prescribed physical exercises during home-based rehabilitation programs plays an important role in regaining muscle strength and improving balance for people with different physical disabilities. However, patients attending these programs are unable to assess their action performance in the absence of a medical expert. Recently, vision-based sensors capable of capturing accurate skeleton data have been deployed in the activity monitoring domain, and there have been significant advancements in Computer Vision (CV) and Deep Learning (DL) methodologies. These factors have enabled solutions for designing automatic patient activity monitoring models, and improving such systems' performance to assist patients and physiotherapists has attracted wide interest from the research community. This paper provides a comprehensive and up-to-date literature review of the different stages of skeleton data acquisition for physiotherapy exercise monitoring. It then reviews previously reported Artificial Intelligence (AI)-based methodologies for skeleton data analysis. In particular, feature learning from skeleton data, evaluation, and feedback generation for the purpose of rehabilitation monitoring are studied, along with the challenges associated with these processes. Finally, the paper puts forward several suggestions for future research directions in this area.


Subject(s)
Artificial Intelligence , Exercise , Humans , Vision, Ocular , Monitoring, Physiologic , Skeleton
10.
Sensors (Basel) ; 23(5)2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36904656

ABSTRACT

Human action recognition has drawn significant attention because of its importance in computer vision-based applications. Action recognition based on skeleton sequences has advanced rapidly in the last decade. Conventional deep learning-based approaches extract features from skeleton sequences through convolutional operations, and most of these architectures learn spatial and temporal features through multiple streams. These studies have illuminated the action recognition task from various algorithmic angles. However, three common issues are observed: (1) the models are usually complicated and therefore have correspondingly high computational complexity; (2) supervised learning models rely on labels during training, which is always a drawback; and (3) large models are not beneficial to real-time applications. To address these issues, in this paper we propose a multi-layer perceptron (MLP)-based self-supervised learning framework with a contrastive learning loss function (ConMLP). ConMLP does not require a massive computational setup and can effectively reduce the consumption of computational resources. Compared with supervised learning frameworks, ConMLP can take advantage of large amounts of unlabeled training data. In addition, it has low system-configuration requirements and is more conducive to being embedded in real-world applications. Extensive experiments show that ConMLP achieves a top-1 inference accuracy of 96.9% on the NTU RGB+D dataset, higher than the state-of-the-art self-supervised learning method. When evaluated in a supervised learning manner, ConMLP also achieves recognition accuracy comparable to the state of the art.
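As a hedged sketch of the family of contrastive objectives ConMLP's loss belongs to (a simplified NT-Xent where only the rows of the second view serve as negatives; this is not the paper's exact loss):

```python
import numpy as np

def nt_xent_pair_loss(z1, z2, temperature=0.5):
    """Simplified NT-Xent: for each embedding in z1, its positive is the
    same-index row of z2; all other rows of z2 act as negatives.
    z1, z2: (N, D) L2-normalized embeddings of two augmented views."""
    sim = z1 @ z2.T / temperature              # (N, N) scaled similarities
    sim = sim - sim.max(axis=1, keepdims=True) # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))         # cross-entropy on the diagonal
```

Minimizing this pulls matched views together and pushes mismatched ones apart, which is what lets the MLP encoder learn from unlabeled skeleton sequences.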

11.
Sensors (Basel) ; 23(5)2023 Mar 03.
Article in English | MEDLINE | ID: mdl-36904990

ABSTRACT

Because of societal changes, human activity recognition, as part of home care systems, has become increasingly important. Camera-based recognition is mainstream but raises privacy concerns and is less accurate under dim lighting. In contrast, radar sensors do not record sensitive information, avoid the invasion of privacy, and work in poor lighting; however, the collected data are often sparse. To address this issue, we propose a novel Multimodal Two-stream GNN Framework for Efficient Point Cloud and Skeleton Data Alignment (MTGEA), which improves recognition accuracy through accurate skeletal features from Kinect models. We first collected two datasets using the mmWave radar and Kinect v4 sensors. Then, we used zero-padding, Gaussian Noise (GN), and Agglomerative Hierarchical Clustering (AHC) to increase the number of collected point clouds to 25 per frame to match the skeleton data. Second, we used the Spatial Temporal Graph Convolutional Network (ST-GCN) architecture to acquire multimodal representations in the spatio-temporal domain, focusing on skeletal features. Finally, we implemented an attention mechanism aligning the two multimodal features to capture the correlation between point clouds and skeleton data. The resulting model was evaluated empirically on human activity data and shown to improve human activity recognition with radar data only. All datasets and code are available on our GitHub.
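The zero-padding step that brings each sparse radar frame up to the 25 points of a Kinect skeleton is straightforward; an illustrative sketch (the GN and AHC augmentation paths are omitted, and the function name is ours):

```python
import numpy as np

def pad_point_cloud(points, target=25):
    """Zero-pad (or truncate) a sparse radar point cloud to a fixed number
    of 3D points per frame, so it can be aligned one-to-one with a
    25-joint Kinect skeleton."""
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    n = len(points)
    if n >= target:
        return points[:target]
    return np.vstack([points, np.zeros((target - n, 3))])
```

With every frame fixed at 25 points, the two modalities share a shape and the attention alignment can operate joint-by-joint.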

12.
Sensors (Basel) ; 22(1)2022 Jan 01.
Article in English | MEDLINE | ID: mdl-35009865

ABSTRACT

In recent years, Human Activity Recognition (HAR) has become one of the most important research topics in the domains of health and human-machine interaction. Many artificial intelligence-based models have been developed for activity recognition; however, these algorithms fail to extract spatial and temporal features and therefore show poor performance on real-world long-term HAR. Furthermore, only a limited number of datasets, containing few activities, are publicly available for physical activity recognition. Considering these limitations, we develop a hybrid model incorporating a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) for activity recognition, where the CNN is used for spatial feature extraction and the LSTM network is utilized for learning temporal information. Additionally, a new challenging dataset is generated, collected from 20 participants using the Kinect V2 sensor and containing 12 different classes of human physical activities. An extensive ablation study is performed over different traditional machine learning and deep learning models to obtain the optimum solution for HAR. An accuracy of 90.89% is achieved via the CNN-LSTM technique, which shows that the proposed model is suitable for HAR applications.
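For readers unfamiliar with the temporal half of such a hybrid, a single LSTM cell step written out in NumPy (the stacked gate layout below is one common convention, not necessarily the authors' framework):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step. Gates are stacked [input, forget, cell, output]
    along the first axis: W: (4H, D), U: (4H, H), b: (4H,).
    Returns the new hidden state h and cell state c."""
    H = len(h)
    z = W @ x + U @ h + b
    i = 1 / (1 + np.exp(-z[:H]))        # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))     # forget gate
    g = np.tanh(z[2*H:3*H])             # candidate cell update
    o = 1 / (1 + np.exp(-z[3*H:]))      # output gate
    c_new = f * c + i * g               # gated memory update
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

Iterating this step over the CNN's per-frame feature vectors is what lets the hybrid accumulate temporal context across a long activity.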


Subject(s)
Deep Learning , Artificial Intelligence , Human Activities , Humans , Machine Learning , Neural Networks, Computer
13.
Sensors (Basel) ; 21(2)2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33430118

ABSTRACT

The recognition of stereotyped actions is one of the core diagnostic criteria of Autism Spectrum Disorder (ASD). However, it relies mainly on parent interviews and clinical observations, which leads to a long diagnosis cycle and prevents ASD children from receiving timely treatment. To speed up the recognition of stereotyped actions, a method based on skeleton data and Long Short-Term Memory (LSTM) is proposed in this paper. In the first stage of our method, the OpenPose algorithm is used to obtain initial skeleton data from videos of ASD children, and four denoising methods are proposed to eliminate noise from these data. In the second stage, we track multiple ASD children in the same scene by matching the distance between current skeletons and previous skeletons. In the last stage, an LSTM-based neural network is proposed to classify the children's actions. Experiments show that our proposed method is effective for action recognition in ASD children. Compared to previous traditional schemes, our scheme has higher accuracy and is almost non-invasive for ASD children.
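The distance-based tracking in the second stage can be illustrated with a greedy nearest-skeleton matcher (the paper's exact matching rule is not given; this is a simplified sketch with our own naming):

```python
import numpy as np

def match_skeletons(prev, curr):
    """Greedily assign each current-frame skeleton to the closest unused
    previous-frame skeleton by mean per-joint distance, so identities
    persist across frames when tracking multiple children.
    prev, curr: lists of (J, 2) joint arrays. Returns {curr_idx: prev_idx}."""
    assignment, used = {}, set()
    for ci, c in enumerate(curr):
        dists = [(np.linalg.norm(c - p, axis=1).mean(), pi)
                 for pi, p in enumerate(prev) if pi not in used]
        if dists:
            d, pi = min(dists)           # nearest remaining previous skeleton
            assignment[ci] = pi
            used.add(pi)
    return assignment
```

A production tracker would typically use optimal (Hungarian) assignment and a distance gate, but the greedy form conveys the idea.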


Subject(s)
Autism Spectrum Disorder , Autism Spectrum Disorder/diagnosis , Child , Humans , Memory, Long-Term , Memory, Short-Term , Recognition, Psychology , Skeleton
14.
Sensors (Basel) ; 17(5)2017 May 11.
Article in English | MEDLINE | ID: mdl-28492486

ABSTRACT

Human activity recognition is an important area in computer vision, with a wide range of applications including ambient assisted living. In this paper, an activity recognition system based on skeleton data extracted from a depth camera is presented. The system uses machine learning techniques to classify actions that are described by a set of a few basic postures. The training phase creates several models, related to the number of clustered postures, by means of a multiclass Support Vector Machine (SVM) trained with Sequential Minimal Optimization (SMO). The classification phase adopts the X-means algorithm to find the optimal number of clusters dynamically. The contribution of the paper is twofold: first, to perform activity recognition using features based on a small number of informative postures, extracted independently from each activity instance; second, to assess the minimum number of frames needed for adequate classification. The system is evaluated on two publicly available datasets, the Cornell Activity Dataset (CAD-60) and the Telecommunication Systems Team (TST) Fall detection dataset. The number of clusters needed to model each instance ranges from two to four elements. The proposed approach reaches excellent performance using only about 4 s of input data (~100 frames) and outperforms the state of the art when it uses approximately 500 frames on the CAD-60 dataset. The results are promising for tests in real contexts.


Subject(s)
Human Activities , Algorithms , Cluster Analysis , Humans , Machine Learning , Skeleton , Support Vector Machine