ETRI-Knowledge Sharing Platform

Details

Conference paper: TeachMe: Three-phase Learning Framework for Robotic Motion Imitation based on Interactive Teaching and Reinforcement Learning
Authors
김태우, 이주행
Publication date
2019-10
Source
International Symposium on Robot and Human Interactive Communication (RO-MAN) 2019, pp.1-8
DOI
https://dx.doi.org/10.1109/RO-MAN46459.2019.8956326
Funded project
19HS6200, Development of Real-Environment Human-Care Robot Technology to Address an Aging Society, 이재연
Abstract
Motion imitation is a fundamental communication skill for a robot, especially as a form of nonverbal interaction with a human. Owing to kinematic configuration differences between the human and the robot, it is challenging to determine the appropriate mapping between the two pose domains. Moreover, technical limitations in extracting 3D motion details, such as wrist joint movements, from human motion videos result in significant challenges in motion retargeting. Explicit mapping between different motion domains is a considerably inefficient solution. To solve these problems, we propose a three-phase reinforcement learning scheme that enables a NAO robot to learn motions from human pose skeletons extracted from video inputs. Our learning scheme consists of three phases: (i) phase one for learning preparation, (ii) phase two for simulation-based reinforcement learning, and (iii) phase three for human-in-the-loop reinforcement learning. In phase one, embeddings of the motions of a human skeleton and the robot are learned by an autoencoder. In phase two, the NAO robot learns a rough imitation skill using reinforcement learning that translates the learned embeddings. In the last phase, the robot learns motion details that were not considered in the previous phases, with rewards set interactively through direct teaching instead of the method used in the previous phase. Notably, phase three requires relatively few interactive inputs for motion details compared with the large volume of training sets required for overall imitation in phase two. The experimental results demonstrate that the proposed method efficiently improves the imitation skills for hand-waving and saluting motions obtained from NTU-DB.
KSP suggested keywords
3D motion, Communication skill, Human Skeleton, Human motion, Human pose, Human-in-the-Loop, Interactive teaching, Learning framework, Motion Retargeting, Motion imitation, Nao robot