ETRI-Knowledge Sharing Platform

Automatic Recognition of Children Engagement from Facial Video using Convolutional Neural Networks
Cited 39 times in Scopus
Authors
Woo-han Yun, Dongjin Lee, Chankyu Park, Jaehong Kim, Junmo Kim
Issue Date
2020-10
Citation
IEEE Transactions on Affective Computing, v.11, no.4, pp.696-707
ISSN
1949-3045
Publisher
IEEE
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/TAFFC.2018.2834350
Abstract
Automatic engagement recognition is a technique that is used to measure the engagement level of people in a specific task. Although previous research has utilized expensive and intrusive devices such as physiological sensors and pressure-sensing chairs, methods using RGB video cameras have become the most common because of the cost efficiency and noninvasiveness of video cameras. Automatic engagement recognition methods using video cameras are usually based on hand-crafted features and a statistical temporal dynamics modeling algorithm. This paper proposes a data-driven convolutional neural networks (CNNs)-based engagement recognition method that uses only facial images from input videos. As the amount of data in a dataset of children's engagement is insufficient for deep learning, pre-trained CNNs are utilized for low-level feature extraction from each video frame. In particular, a new layer combination for temporal dynamics modeling is employed to extract high-level features from low-level features. Experimental results on a database created using images of children from kindergarten demonstrate that the performance of the proposed method is superior to that of previous methods. The results indicate that the engagement level of children can be gauged automatically via deep learning even when the available database is deficient.
KSP Keywords
Automatic recognition, Convolution neural network (CNN), Cost Efficiency, Data-Driven, Dynamics modeling, Facial image, Layer combination, Recognition method, Temporal Dynamics, deep learning (DL), facial video