Multi-Task Learning for Age, Gender, and Emotion Recognition on Edge Processing

Ha Xuan Nguyen; An Dao; Duc Quang Tran; Minh Tuan Dang

doi:10.51316/jst.184.ssad.2025.35.3.2

Date Published: 15/09/2025

Issue

Vol. 35 No. 3 (2025): JST: Smart Systems and Devices (09/2025)

Section

Research article

How to Cite

Nguyen, H. X., Dao, A., Tran, D. Q., & Dang, M. T. (2025). Multi-Task Learning for Age, Gender, and Emotion Recognition on Edge Processing. Smart Systems and Devices, 35(3), 009-015. https://doi.org/10.51316/jst.184.ssad.2025.35.3.2

Ha Xuan Nguyen¹,², An Dao², Duc Quang Tran², Minh Tuan Dang³
¹ Hanoi University of Science and Technology, Ha Noi, Vietnam
² CMC Applied Technology Institute, CMC Corporation, Ha Noi, Vietnam
³ CMC University, CMC Corporation, Ha Noi, Vietnam

Abstract

This work develops a multi-task learning model for age, gender, and emotion recognition on edge devices. The model uses MobileNetV2 as its backbone, with the last three layers customized to produce three outputs: age, gender, and emotion. It was trained and tested on a combination of the well-known Internet Movie Database (IMDB) dataset and a self-collected dataset. The trained model was then optimized and quantized to run on the Neural Processing Unit (NPU) of the Rockchip RK3588 chip on the Orange Pi Plus hardware platform. Experimental evaluation was performed on several test cases. The results show that the multi-task model achieves prediction accuracy as high as that of the single-task models while significantly reducing computational requirements. On the Orange Pi platform, the best results are a mean absolute error (MAE) of 3.485 for age and accuracies of 98.281% and 93.917% for gender and emotion, respectively. Throughput peaks at 285.7 frames per second. These results show high potential for practical applications on edge devices.
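The abstract reports age as a mean absolute error (in years) and gender and emotion as classification accuracy (in percent). As a minimal sketch of how such figures are typically computed, assuming per-sample ground-truth and predicted values are available, the two metrics could look like the following. The function names, the toy data, and the emotion class names are illustrative assumptions and are not taken from the article.

```python
from typing import Sequence


def mean_absolute_error(true_ages: Sequence[float],
                        predicted_ages: Sequence[float]) -> float:
    """Average absolute gap, in years, between predicted and true ages."""
    if len(true_ages) != len(predicted_ages) or len(true_ages) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def accuracy_percent(true_labels: Sequence[str],
                     predicted_labels: Sequence[str]) -> float:
    """Share of samples whose predicted label matches the true label, in percent."""
    if len(true_labels) != len(predicted_labels) or len(true_labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Made-up toy data for four faces (not results from the article).
true_ages = [23, 41, 35, 60]
pred_ages = [25, 38, 35, 64]

true_gender = ["F", "M", "M", "F"]
pred_gender = ["F", "M", "F", "F"]

# Hypothetical emotion classes; the article's label set is not given here.
true_emotion = ["happy", "neutral", "sad", "happy"]
pred_emotion = ["happy", "neutral", "happy", "neutral"]

age_mae = mean_absolute_error(true_ages, pred_ages)
gender_acc = accuracy_percent(true_gender, pred_gender)
emotion_acc = accuracy_percent(true_emotion, pred_emotion)

print(f"age MAE: {age_mae:.3f} years")
print(f"gender accuracy: {gender_acc:.3f}%")
print(f"emotion accuracy: {emotion_acc:.3f}%")
```

A lower MAE is better, while higher accuracy is better, which is why the age figure in the abstract is read on a different scale from the two percentages.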

Keywords

Age, gender, and emotion recognition, multi-task learning, NPUs, edge processing.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

