ETRI-Knowledge Sharing Platform

MuDE: Multi-Agent Decomposed Reward-Based Exploration
Cited 2 times in Scopus. Downloaded 356 times.
Authors
Byunghyun Yoo, Sungwon Yi, Hyunwoo Kim, Younghwan Shin, Ran Han, Seungwoo Seo, Hwa Jeon Song, Euisok Chung, Jeongmin Yang
Issue Date
2024-11
Citation
Neural Networks, v.179, pp.1-13
ISSN
0893-6080
Publisher
Elsevier Ltd.
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1016/j.neunet.2024.106565
Abstract
In cooperative multi-agent reinforcement learning, agents jointly optimize a centralized value function based on the rewards shared by all agents and learn decentralized policies through value function decomposition. Although this learning framework is considered effective, estimating each agent's individual contribution from the shared rewards, which is essential for learning highly cooperative behaviors, is difficult. The problem becomes more challenging when reinforcement and punishment, which respectively increase and decrease specific behaviors of agents, coexist, because maximizing reinforcement and minimizing punishment often conflict in practice. This study proposes a novel exploration scheme called multi-agent decomposed reward-based exploration (MuDE), which preferentially explores the action spaces associated with positive sub-rewards based on a modified reward decomposition scheme, thus effectively exploring action spaces not reachable by existing exploration schemes. We evaluate MuDE on a challenging set of StarCraft II micromanagement tasks and modified predator–prey tasks extended to include reinforcement and punishment. The results show that MuDE accurately estimates sub-rewards and outperforms state-of-the-art approaches in both convergence speed and win rates.
KSP Keywords
Cooperative behaviors, Cooperative multi-agent, Decomposition scheme, Function decomposition, Learning framework, Reinforcement learning (RL), Reward-based, StarCraft II, Convergence speed, Multi-agent reinforcement learning, State-of-the-art
This work is distributed under the terms of a Creative Commons License (CC BY-NC-ND).