Two-Stream Mixed Convolutional Neural Network for American Sign Language Recognition.
Ma, Ying; Xu, Tianpei; Kim, Kangchul.
Affiliation
  • Ma Y; Department of Computer Engineering, Chonnam National University, Yeosu 59626, Korea.
  • Xu T; Department of Computer Engineering, Chonnam National University, Yeosu 59626, Korea.
  • Kim K; Department of Computer Engineering, Chonnam National University, Yeosu 59626, Korea.
Sensors (Basel) ; 22(16)2022 Aug 09.
Article en En | MEDLINE | ID: mdl-36015719
The Convolutional Neural Network (CNN) has demonstrated excellent performance in image recognition and has brought new opportunities for sign language recognition. However, features undergo many nonlinear transformations during the convolutional operation, and traditional CNN models are insufficient for capturing the correlation between images. In American Sign Language (ASL) recognition, the letters J and Z, which involve moving gestures, pose particular recognition challenges. This paper proposes a novel Two-Stream Mixed (TSM) method with feature extraction and fusion operations to improve the correlation of feature expression between two time-consecutive images of a dynamic gesture. The proposed TSM-CNN system is composed of a preprocessing stage, the TSM block, and CNN classifiers. Two consecutive images of the dynamic gesture serve as the inputs of the two streams, and resizing, transformation, and augmentation are carried out in the preprocessing stage. The fused feature map, obtained by addition and concatenation in the TSM block, is used as the input of the classifier, which then classifies the images. Among three concatenation methods, the TSM-CNN model with the highest performance scores is selected as the definitive recognition model for ASL recognition. We design four CNN models with TSM: TSM-LeNet, TSM-AlexNet, TSM-ResNet18, and TSM-ResNet50. The experimental results show that CNN models with the TSM outperform models without it. TSM-ResNet50 achieves the best accuracy of 97.57% on the MNIST and ASL datasets and can be applied to an RGB image sensing system for hearing-impaired people.
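The abstract describes the TSM block only at a high level: feature maps from two time-consecutive frames are fused by element-wise addition and channel-wise concatenation before classification. A minimal sketch of that fusion idea, assuming same-shaped NumPy feature maps (the function name `tsm_fuse` and the toy shapes are illustrative, not from the paper):

```python
import numpy as np

def tsm_fuse(feat_a, feat_b):
    """Fuse feature maps from two time-consecutive frames.

    Element-wise addition mixes shared structure across the two streams,
    while channel-wise concatenation preserves each stream's own features.
    A fused map of this kind would then feed a CNN classifier.
    """
    added = feat_a + feat_b                             # shape (H, W, C)
    concat = np.concatenate([feat_a, feat_b], axis=-1)  # shape (H, W, 2C)
    return added, concat

# Toy 4x4 feature maps with 8 channels for two consecutive frames.
f1 = np.random.rand(4, 4, 8)
f2 = np.random.rand(4, 4, 8)
added, concat = tsm_fuse(f1, f2)
print(added.shape, concat.shape)  # (4, 4, 8) (4, 4, 16)
```

The paper evaluates three concatenation variants; which axes and orderings they use is not stated in this record, so this sketch only shows the two generic fusion operations named in the abstract.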
Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Sign Language / Neural Networks, Computer Study type: Prognostic_studies Limit: Humans Language: En Journal: Sensors (Basel) Year: 2022 Document type: Article Country of publication: Switzerland