ETRI Knowledge Sharing Platform


Details

Conference Paper: Periodic Communication for Distributed Multi-agent Reinforcement Learning under Partially Observable Environment
Cited 1 time in Scopus, downloaded 6 times
Authors
김성현, 이동훈, 장인국, 김현석, 손영성
Publication Date
October 2019
Source
International Conference on Information and Communication Technology Convergence (ICTC) 2019, pp.940-942
DOI
https://dx.doi.org/10.1109/ICTC46691.2019.8939754
Research Project
19ZH1100, Core Source Technologies of Distributed Intelligence for Hyper-connected Spaces to Organically Connect Things, People, and Space, 박준희
Abstract
In reinforcement learning, the environment matters: a technique that performs well in one environment is not guaranteed to perform well in another. When applying reinforcement learning to real-world environments, partial observability is one of the key issues, and under partial observability, communication also becomes important in multi-agent settings. This paper studies periodic communication for distributed multi-agent reinforcement learning. Through periodic communication, each agent shares an auxiliary observation, a compressed version of its own observation; each agent then makes decisions based on its own observation and the auxiliary observations shared by the others. Because communication is periodic, however, shared auxiliary observations are unavailable during non-communication phases, so this absence must be compensated for. To this end, several methods are proposed for predicting the auxiliary observations of the other agents. Simulation results show the effect of the partial observability condition and the performance of the proposed compensation methods.
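
The mechanism described in the abstract can be summarized in a short sketch. The Python example below is purely illustrative: the communication period, the observation and auxiliary dimensions, the mean-pooling compressor, the zero-order-hold compensation, and the dummy policy are all assumptions of this sketch, not the paper's actual design. It shows agents broadcasting compressed auxiliary observations every few steps and falling back on a predicted value during non-communication phases.

# Minimal sketch of periodic auxiliary-observation sharing.
# All constants and models below are hypothetical stand-ins,
# not the paper's actual architecture or experimental setup.
import numpy as np

COMM_PERIOD = 4   # exchange auxiliary observations every 4 steps (assumed)
OBS_DIM = 16      # raw (partial) observation size per agent (assumed)
AUX_DIM = 4       # compressed auxiliary-observation size (assumed)


def compress(obs: np.ndarray) -> np.ndarray:
    """Compress a raw observation into an auxiliary observation.
    Mean-pooling is used purely for illustration; the paper does
    not specify this particular compressor."""
    return obs.reshape(AUX_DIM, -1).mean(axis=1)


class Agent:
    def __init__(self, agent_id: int, n_agents: int):
        self.agent_id = agent_id
        # Last auxiliary observation received from every other agent.
        self.last_aux = {j: np.zeros(AUX_DIM)
                         for j in range(n_agents) if j != agent_id}

    def predict_aux(self, other_id: int) -> np.ndarray:
        """Fill in a missing auxiliary observation during a
        non-communication phase. A zero-order hold (reuse the last
        received value) is the simplest possible compensator; the
        paper proposes prediction-based methods instead."""
        return self.last_aux[other_id]

    def act(self, own_obs: np.ndarray, aux_from_others: dict) -> int:
        """Decide based on the agent's own observation plus the
        shared or predicted auxiliary observations. A dummy
        2-action rule stands in for the learned policy."""
        joint = np.concatenate(
            [own_obs] + [aux_from_others[j] for j in sorted(aux_from_others)])
        return int(joint.sum()) % 2


def run_episode(n_agents: int = 3, steps: int = 12, seed: int = 0):
    rng = np.random.default_rng(seed)
    agents = [Agent(i, n_agents) for i in range(n_agents)]
    for t in range(steps):
        obs = [rng.normal(size=OBS_DIM) for _ in range(n_agents)]
        communicating = (t % COMM_PERIOD == 0)  # periodic communication phase
        if communicating:
            # Broadcast compressed observations and cache them.
            aux = [compress(o) for o in obs]
            for agent in agents:
                for j in range(n_agents):
                    if j != agent.agent_id:
                        agent.last_aux[j] = aux[j]
        for agent in agents:
            others = {
                j: agent.last_aux[j] if communicating else agent.predict_aux(j)
                for j in range(n_agents) if j != agent.agent_id
            }
            agent.act(obs[agent.agent_id], others)


run_episode()

In this sketch, replacing predict_aux with a learned predictor of the other agents' auxiliary observations would correspond to the prediction-based compensation methods the abstract refers to.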
KSP Suggested Keywords
Distributed multi-agent, Periodic Communication, Real-world, Reinforcement Learning (RL), multi-agent reinforcement learning, partial observations, simulation results