ETRI-Knowledge Sharing Platform

Egocentric Hand Activity Video Dataset and Bidirectional Motion-Priors for Hand Action Recognition
Authors
Jiyoung Seo, Dong In Lee, Pilhyeon Lee, Jiwoo Lee, Younhee Gil, Karthik Ramani, Sangpil Kim
Issue Date
2026-01
Citation
IEEE Access, v.14, pp.8128-8148
ISSN
2169-3536
Publisher
IEEE
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/ACCESS.2026.3652803
Abstract
Recognizing tool-based hand activities from a first-person view is a critical yet challenging task in computer vision, due to the complexity of hand-object interactions and often subtle, ambiguous motion patterns. In real-world manufacturing scenarios, these challenges are exacerbated by bidirectional action pairs whose visual cues are almost identical, with differences revealed only through subtle motion dynamics. However, existing datasets rarely capture these direction-sensitive interactions at scale, particularly in realistic tool-use contexts, limiting the ability of current models to learn the fine-grained motion dynamics essential for accurate recognition. We introduce Ego-Bi (Egocentric-Bidirectional dataset), a large-scale, real-world egocentric RGB video dataset comprising 1,223 video sequences and 622,737 frames that cover diverse tool-use activities in unconstrained environments. Ego-Bi provides an extended 38-category hand type taxonomy, detailed object–tool labels, and challenging bidirectional action pairs, offering rich semantic and temporal cues for modeling complex hand–object interactions. In addition, to address the ambiguity in motion dynamics, we propose a Bidirectional Motion Prior (BMP) module that derives rotation and directional cues from predicted 3D hand poses to improve class separability of visually similar actions. Experimental results on Ego-Bi demonstrate that our approach improves bidirectional action recognition accuracy by 8.96% over the baseline, while also yielding consistent gains across general action classes without requiring costly 3D pose annotations. Furthermore, the proposed motion priors generalize effectively to other egocentric benchmarks, underscoring their robustness in handling visually similar, direction-sensitive actions.
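The abstract does not say how the BMP module computes its rotation and directional cues, so the following is only an illustrative sketch and not the authors' method. It assumes a 21-joint hand layout with hypothetical joint indices, takes the directional cue to be the net wrist displacement, and takes the rotation cue to be the signed angle swept by the knuckle line about a chosen axis. The function name `motion_cues` and the synthetic sequence are likewise assumptions made for illustration.

```python
import numpy as np

# Hypothetical joint indices (a common 21-joint layout); not taken from the paper.
WRIST, INDEX_MCP, PINKY_MCP = 0, 5, 17


def motion_cues(poses, axis):
    """Return (direction, rotation) for a sequence of 3D hand poses.

    poses: array of shape (T, J, 3) with predicted 3D joints per frame.
    axis:  3-vector about which the signed rotation is measured.
    """
    poses = np.asarray(poses, dtype=float)
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)

    # Directional cue: unit vector of the net wrist displacement.
    disp = poses[-1, WRIST] - poses[0, WRIST]
    norm = np.linalg.norm(disp)
    direction = disp / norm if norm > 1e-8 else np.zeros(3)

    # Rotation cue: project the knuckle line onto the plane orthogonal
    # to `axis`, then sum the signed frame-to-frame angles.
    v = poses[:, INDEX_MCP] - poses[:, PINKY_MCP]
    v = v - np.outer(v @ axis, axis)
    sin_term = np.cross(v[:-1], v[1:]) @ axis
    cos_term = np.einsum("ij,ij->i", v[:-1], v[1:])
    rotation = float(np.sum(np.arctan2(sin_term, cos_term)))
    return direction, rotation


# Synthetic example: the hand moves along +x while turning 90 degrees
# counterclockwise about +z. Playing it backwards mimics the paired action.
T = 10
theta = np.linspace(0.0, np.pi / 2, T)
wrist = np.stack([np.linspace(0.0, 0.09, T), np.zeros(T), np.zeros(T)], axis=1)
knuckle = 0.05 * np.stack([np.cos(theta), np.sin(theta), np.zeros(T)], axis=1)

forward = np.zeros((T, 21, 3))
forward[:, WRIST] = wrist
forward[:, INDEX_MCP] = wrist + knuckle
forward[:, PINKY_MCP] = wrist - knuckle
backward = forward[::-1]

dir_fwd, rot_fwd = motion_cues(forward, axis=[0, 0, 1])
dir_bwd, rot_bwd = motion_cues(backward, axis=[0, 0, 1])
```

In this toy setup the two sequences contain exactly the same set of frames, yet both cues flip sign between them. This is the kind of separation between bidirectional action pairs that the abstract attributes to motion priors.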
Keyword
Dynamic motion cue, hand action recognition, hand pose estimation, hand type taxonomy, hand-object interaction
KSP Keywords
3D pose, Accurate Recognition, Action recognition, Class separability, Computer Vision(CV), Directional cues, Dynamic motion, Fine grained(FG), Hand-object interaction, Motion cue, Motion dynamics
This work is distributed under the terms of a Creative Commons License (CC BY).