ETRI-Knowledge Sharing Platform

Conference Paper [Work in progress] Stochastic Policy Optimization with Heuristic Information for Robot Learning
Cited 0 times in Scopus
Authors
Seonghyun Kim, Ingook Jang, Samyeul Noh, Hyunseok Kim
Issue Date
2021-11
Citation
Conference on Robot Learning (CoRL) 2021, pp.1-10
Language
English
Type
Conference Paper
Abstract
Stochastic policy-based deep reinforcement learning (RL) approaches have been remarkably successful on continuous control tasks. However, applying these methods to manipulation tasks remains a challenge, since the actuators of a robot manipulator require high-dimensional continuous action spaces. In this paper, we propose exploration-bounded exploration actor-critic (EBE-AC), a novel deep RL approach that combines stochastic policy optimization with interpretable human knowledge. The human knowledge is defined as heuristic information based on both the physical relationships between a robot and objects and binary signals indicating whether the robot has achieved certain states. EBE-AC combines an off-policy actor-critic algorithm with entropy maximization based on the heuristic information. On a robotic manipulation task, we demonstrate that EBE-AC outperforms prior state-of-the-art off-policy actor-critic deep RL algorithms in terms of sample efficiency. In addition, we found that EBE-AC can be easily combined with latent information, and that EBE-AC with latent information further improves sample efficiency and robustness.
KSP Keywords
Actor-critic algorithm, Continuous action, Continuous control, Deep reinforcement learning, Entropy maximization, Heuristic information, High-dimensional, Human knowledge, Policy optimization, Reinforcement learning (RL), Robot learning