ETRI Knowledge Sharing Platform

Details

Conference paper: Effects of Hyper-Parameters for Deep Reinforcement Learning in Robotic Motion Mimicry: A Preliminary Study
Authors
Taewoo Kim (김태우), Joo-Haeng Lee (이주행)
Publication date
2019-06
Source
International Conference on Ubiquitous Robots (UR) 2019, pp.228-235
DOI
https://dx.doi.org/10.1109/URAI.2019.8768564
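Funded project
19HS6200, Development of Human-Care Robot Technology for Real Environments in Response to an Aging Society (고령 사회에 대응하기 위한 실환경 휴먼케어 로봇 기술 개발), Jaeyeon Lee (이재연)
Abstract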
This paper reports initial results on how various hyper-parameter configurations affect the performance of the learning process and the quality of the generated motions when deep reinforcement learning is applied to the motion mimicry problem between teacher and student robots. The hyper-parameters considered in this study include the structure of policies such as convolutional and fully connected networks, the type of activation functions such as ReLU and hyperbolic tangent, and the number of input sequences such as one, four, and eight. Under these deep neural network configurations, the PPO reinforcement learning algorithm is applied for learning. In the simulator environment, the teacher NAO robot demonstrates a target action repeatedly, and the learner NAO robot tries to learn that action. The target actions include handshaking and two-arm raising. Our experimental results show that fully connected networks outperform their convolutional counterparts in both training statistics and motion quality. For activation functions, however, we found an interesting mismatch between training and evaluation quality: for example, a configuration with higher rewards does not guarantee less motion discrepancy, which may suggest a new research direction toward designing better loss and reward functions for robotic motion mimicry.
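As a rough illustration of the configuration space named in the abstract, the sketch below enumerates every combination of policy structure, activation function, and input sequence length. It assumes a full factorial grid, which the abstract does not confirm; the names `PolicyConfig`, `NETWORKS`, `ACTIVATIONS`, `INPUT_SEQUENCE_LENGTHS`, and `enumerate_configs` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PolicyConfig:
    """One hyper-parameter configuration (hypothetical container, not from the paper)."""
    network: str     # policy structure: "cnn" (convolutional) or "fc" (fully connected)
    activation: str  # activation function: "relu" or "tanh"
    n_inputs: int    # number of input sequences fed to the policy


# Values listed in the abstract; treating them as a full grid is an assumption.
NETWORKS = ("cnn", "fc")
ACTIVATIONS = ("relu", "tanh")
INPUT_SEQUENCE_LENGTHS = (1, 4, 8)


def enumerate_configs():
    """Return every combination of the three hyper-parameters."""
    return [
        PolicyConfig(network, activation, n_inputs)
        for network, activation, n_inputs in product(
            NETWORKS, ACTIVATIONS, INPUT_SEQUENCE_LENGTHS
        )
    ]


configs = enumerate_configs()
for cfg in configs:
    print(cfg)
```

Under this assumption the grid contains 2 x 2 x 3 = 12 configurations, each of which would be trained with PPO and then compared on both training reward and motion discrepancy.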
KSP suggested keywords
Activation function, Deep neural network(DNN), Deep reinforcement learning, Hyperbolic tangent, Nao robot, Network configuration, Preliminary study, Reinforcement Learning(RL), Reinforcement learning algorithm, Research direction, Training and evaluation