ETRI Knowledge Sharing Platform : Learning Multi-modal Attentional Consensus in Action Recognition for Elderly-Care Robots

Learning Multi-modal Attentional Consensus in Action Recognition for Elderly-Care Robots

Cited 4 times in Scopus

Authors: Hyungmin Kim, Dohyung Kim, Jaehong Kim

Issue Date: 2021-07

Citation: International Conference on Ubiquitous Robots (UR) 2021, pp.308-313

Publisher: IEEE

Language: English

Type: Conference Paper

DOI: https://dx.doi.org/10.1109/UR52253.2021.9494666

Abstract: This paper addresses a practical action recognition method for elderly-care robots. Multi-stream models are a promising approach to handling the complexity of real-world environments. While multi-modal action recognition has been actively studied, there is a lack of research on models that effectively combine features of different modalities. This paper proposes a new mid-level feature fusion method for a two-stream action recognition network. In multi-modal approaches, extracting complementary information from different modalities is an essential task. Our network model is designed to fuse features at an intermediate level of feature extraction, which leverages the whole feature map from each modality. A consensus feature map and a consensus attention mechanism are proposed as effective ways to extract information from two different modalities: RGB data and motion features. We also introduce ETRI-Activity3D-LivingLab, a real-world RGB-D dataset for robots to recognize daily activities of the elderly. It is the first 3D action recognition dataset obtained in a variety of home environments where the elderly actually reside. We expect our new dataset, together with the previously released ETRI-Activity3D dataset, to contribute to the practical study of action recognition. To demonstrate the effectiveness of the method, extensive experiments are performed on the NTU RGB+D, ETRI-Activity3D, and ETRI-Activity3D-LivingLab datasets. Our mid-level fusion method achieves competitive performance in various experimental settings, especially in domain-changing situations.
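
Illustrative note: the abstract does not give the exact formulation of the consensus feature map or the consensus attention. The sketch below is only a generic example of mid-level fusion of two feature maps, written in plain Python. The function name `consensus_fuse`, the element-wise product used as the "consensus", and the sigmoid spatial attention are all assumptions made for illustration and should not be read as the authors' method.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def consensus_fuse(rgb, motion):
    """Fuse two feature maps of identical shape [channels][height][width].

    Assumptions (not taken from the paper): the consensus map is the
    element-wise product of the two streams, and the attention weight at each
    spatial location is a sigmoid of the channel-averaged consensus.
    """
    channels, height, width = len(rgb), len(rgb[0]), len(rgb[0][0])

    # Hypothetical consensus map: large where both modalities respond strongly.
    consensus = [[[rgb[c][y][x] * motion[c][y][x] for x in range(width)]
                  for y in range(height)] for c in range(channels)]

    # Spatial attention: one weight in (0, 1) per location.
    attention = [[sigmoid(sum(consensus[c][y][x] for c in range(channels)) / channels)
                  for x in range(width)] for y in range(height)]

    # Fused map: attention-weighted sum of the two streams.
    fused = [[[attention[y][x] * (rgb[c][y][x] + motion[c][y][x])
               for x in range(width)] for y in range(height)]
             for c in range(channels)]
    return fused, attention


# Toy 1-channel, 2x2 feature maps standing in for RGB and motion features.
rgb_map = [[[1.0, 0.0], [0.0, 2.0]]]
motion_map = [[[1.0, 3.0], [0.0, 2.0]]]
fused_map, attention_map = consensus_fuse(rgb_map, motion_map)
```

In this toy setup, locations where both streams are active receive attention above 0.5, while locations where either stream is zero receive exactly 0.5. That is the kind of agreement-based weighting that mid-level fusion aims for.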

KSP Keywords: 3D action recognition, Attention mechanism, Competitive performance, Daily activities, Feature extraction, Feature fusion, Feature map, Fusion method, Multi-modal, Multi-stream, Network model

ETRI

218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, KOREA, Contact: sh.kim@etri.re.kr

Please refrain from automatic collection of e-mail addresses posted on this homepage.
