ETRI-Knowledge Sharing Platform

SA-MARL: Novel Self-Attention-Based Multi-Agent Reinforcement Learning With Stochastic Gradient Descent
Cited 1 time in Scopus · Downloaded 36 times
Authors
Rabbiya Younas, Hafiz Muhammad Raza Ur Rehman, Ingyu Lee, Byung-Won On, Sungwon Yi, Gyu Sang Choi
Issue Date
2025-02
Citation
IEEE Access, v.13, pp.35674-35687
ISSN
2169-3536
Publisher
Institute of Electrical and Electronics Engineers Inc.
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/ACCESS.2025.3544961
Abstract
In the rapidly advancing Reinforcement Learning (RL) field, Multi-Agent Reinforcement Learning (MARL) has emerged as a key approach to solving complex real-world challenges. A pivotal development in this realm is the introduction of the mixing network, representing a significant leap forward in the capabilities of multi-agent systems. Drawing inspiration from the COMA and VDN methodologies, the mixing network overcomes limitations in extracting combined Q-values from joint state-action interactions. Previous approaches such as COMA and VDN could not fully exploit the state information available during training, limiting their effectiveness. QMIX and QVMinMax addressed this issue by employing neural networks to convert centralized states into weights for a second neural network, akin to hypernetworks. However, these solutions introduced challenges such as computational intensity and susceptibility to local minima. To overcome these hurdles, our proposed methodology makes three key contributions. First, we introduce the state-fusion network, a self-attention-based alternative to the traditional mixing network. Second, to address the local-optima problem in MARL algorithms, we leverage the Grey Wolf Optimizer for weight and bias selection, adding a stochastic element for improved optimization. Finally, we provide a comprehensive comparison with QMIX, evaluating performance under two optimization methods: gradient descent and a stochastic optimizer. Using the StarCraft II Learning Environment (SC2LE) as our experimental platform, our results demonstrate the superiority of our methodology over QMIX, QVMinMax, and QSOD in absolute performance, particularly when operating under resource constraints. Our proposed methodology contributes to the ongoing evolution of MARL techniques, showcasing advancements in attention mechanisms and optimization strategies for enhanced multi-agent system capabilities.
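The record does not include the paper's code, so as a rough illustration only, the sketch below shows one way a self-attention "state-fusion" step over per-agent Q-values and the global state might look in PyTorch. All names, shapes, and design choices here (the StateFusionAttention class, a single attention head, mean pooling to a scalar joint Q-value) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class StateFusionAttention(nn.Module):
    """Hypothetical sketch: fuse per-agent Q-values with the global state via
    single-head self-attention, in place of a QMIX-style mixing network."""

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        # Embed the global state and each agent's Q-value into a shared space.
        self.state_embed = nn.Linear(state_dim, embed_dim)
        self.q_embed = nn.Linear(1, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=1, batch_first=True)
        self.out = nn.Linear(embed_dim, 1)  # scalar joint Q-value estimate

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        q_tokens = self.q_embed(agent_qs.unsqueeze(-1))   # (batch, n_agents, embed)
        s_token = self.state_embed(state).unsqueeze(1)    # (batch, 1, embed)
        tokens = torch.cat([s_token, q_tokens], dim=1)    # (batch, 1 + n_agents, embed)
        fused, _ = self.attn(tokens, tokens, tokens)      # self-attention over all tokens
        return self.out(fused.mean(dim=1))                # (batch, 1) joint Q-value
```

Likewise, a minimal Grey Wolf Optimizer step over a population of candidate weight vectors could look like the following. The gwo_step function, the use of a loss value as fitness, and the leader-averaging update follow the standard GWO formulation; they are not taken from the paper and may differ from its exact stochastic selection procedure.

```python
import numpy as np

def gwo_step(wolves: np.ndarray, fitness: np.ndarray, a: float) -> np.ndarray:
    """One standard Grey Wolf Optimizer update. `wolves` is (pop, dim),
    `fitness` is (pop,) with lower = better, and `a` decays from 2 to 0."""
    order = np.argsort(fitness)
    alpha, beta, delta = wolves[order[:3]]        # three best candidates lead the pack
    new_wolves = np.empty_like(wolves)
    for i, w in enumerate(wolves):
        pulls = []
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(*w.shape), np.random.rand(*w.shape)
            A, C = 2 * a * r1 - a, 2 * r2         # stochastic coefficients
            d = np.abs(C * leader - w)            # distance to the leader
            pulls.append(leader - A * d)          # candidate position toward the leader
        new_wolves[i] = np.mean(pulls, axis=0)    # average of the three pulls
    return new_wolves
```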
KSP Keywords
Attention mechanism, Grey Wolf optimizer, Local minima, Multi-agent system(MAS), Optimization methods, Optimization strategies, Q-value, Real-world, Reinforcement learning(RL), StarCraft II, Stochastic Gradient Descent
This work is distributed under the terms of the Creative Commons License (CC BY).