A Comprehensive Review of Vietnamese Sign Language Recognition Techniques
Abstract
This paper presents a systematic quantitative literature review of Vietnamese Sign Language (VSL) recognition techniques developed between 2015 and 2025. VSL recognition plays a vital role in bridging communication gaps and improving accessibility for the deaf and hard-of-hearing community in Vietnam. To identify and synthesize current trends and challenges, we conducted a structured search and screening process across major academic databases. The studies retained after screening were analyzed by recognition approach (e.g., computer vision, wearable sensors, data-driven methods, and multimodal data fusion), datasets used, feature extraction strategies, classification models, and performance metrics. Descriptive statistics were used to map the evolution of methods over time, while comparative analyses highlighted the strengths and limitations of different techniques across real-time and static recognition tasks. Our findings indicate a growing shift towards deep learning and sensor fusion methods, though limitations persist in dataset availability, model generalizability, and real-world deployment. This review identifies current research gaps and offers guidance for future work on scalable, culturally adaptive VSL recognition systems.
Keywords
Computer vision, machine learning, sign language recognition, Vietnamese sign language.