ETRI-Knowledge Sharing Platform

Detail

Journal Article Facial Landmark, Head Pose, and Occlusion Analysis using Multitask Stacked Hourglass
Cited 2 times in Scopus; downloaded 148 times
Authors
Youngsam Kim, Jong-Hyuk Roh, Soohyung Kim
Issue Date
2023-03
Citation
IEEE Access, v.11, pp.30970-30981
ISSN
2169-3536
Publisher
Institute of Electrical and Electronics Engineers Inc.
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/ACCESS.2023.3262247
Abstract
In this study, we proposed a multitask network architecture that estimates three attributes from a face image: landmarks, head pose, and occlusion. The proposed network is a 2-stacked hourglass with three task-specific heads. We also designed three auxiliary components for the network. The first is the feature pyramid fusion module, which plays a crucial role in gathering contextual information from various receptive fields. The second is the interlevel occlusion-aware fusion module, which explicitly fuses intermediate occlusion predictions between subnetworks. The third is the gimbal-lock-free head pose head, which outputs a rotation matrix from a 6D rotation representation. We conducted an ablation study of these auxiliary components to determine their impact on the network. Additionally, we introduced a landmark heatmap scaling approach to avoid falling into local minima. We trained the proposed network on the 300W-LP dataset for landmark and head pose and on the C-CM dataset for occlusion. Then, for the landmark task, we fine-tuned the network on the 300W or WFLW dataset instead of the 300W-LP dataset. This 2-stage training method improves landmark detection accuracy as well as the accuracy of the other tasks. In the experiments, we assessed the performance of the proposed network on eight test datasets using task-specific metrics. The results show that the proposed network achieved competitive performance across all the datasets and notably outperformed the state-of-the-art methods on the AFLW2000 and Masked 300W datasets.
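The abstract does not spell out how the head pose head converts its 6D output into a rotation matrix. A commonly used construction for 6D rotation representations is Gram-Schmidt orthogonalization of two 3D vectors, and the sketch below illustrates that construction only; it is not taken from the paper. The function name `rotation_from_6d`, the ordering of the six values (first vector, then second), and the choice to store the basis vectors as matrix columns are all assumptions made for illustration.

```python
import math


def _normalize(v):
    """Return v scaled to unit length; reject near-zero vectors."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm < 1e-12:
        raise ValueError("cannot normalize a near-zero vector")
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(six):
    """Map six raw numbers to a 3x3 rotation matrix (list of rows).

    Hypothetical helper: assumes six = [a1x, a1y, a1z, a2x, a2y, a2z]
    and that the resulting basis vectors form the matrix columns.
    """
    a1, a2 = list(six[:3]), list(six[3:6])
    b1 = _normalize(a1)                      # first basis vector
    proj = _dot(b1, a2)                      # component of a2 along b1
    b2 = _normalize([y - proj * x for x, y in zip(b1, a2)])
    b3 = _cross(b1, b2)                      # completes a right-handed basis
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


if __name__ == "__main__":
    for row in rotation_from_6d([0.3, -1.2, 0.5, 2.0, 0.1, -0.7]):
        print(row)
```

Because the matrix is built directly from an orthonormal basis, no Euler angles are involved in producing it, which is why representations of this kind avoid gimbal lock. The construction fails only when the two input vectors are zero or parallel, which the sketch reports as an error.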
KSP Keywords
Competitive performance, Contextual information, Detection accuracy, Face image, Head pose, Local minima, Network Architecture, Occlusion analysis, Occlusion prediction, Receptive field, Rotation matrix
This work is distributed under the terms of a Creative Commons License (CCL)
(CC BY NC ND)