ETRI-Knowledge Sharing Platform


Detail

Journal Article On Analysis of Clipped Critic Loss in Proximal Policy Optimization
Cited 0 times in Scopus. Downloaded 38 times.
Authors
Yongjin Lee, Moonyoung Chung
Issue Date
2026-01
Citation
Electronics Letters, v.62, no.1, pp.1-5
ISSN
0013-5194
Publisher
John Wiley & Sons
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1049/ell2.70545
Abstract
Proximal policy optimization (PPO) is a widely used reinforcement learning algorithm valued for its robustness and sample efficiency. Its success is often attributed to the actor's clipped loss, which keeps policy updates within a trust region. In contrast, the critic's clipped loss has received relatively little attention, leaving its consistency with the trust-region principle unclear. To bridge this gap, we analyze the critic's clipped loss, show its misalignment, and propose a refined loss that enforces trust-region compliance by construction. Experiments on continuous-control tasks confirm that the proposed method improves adherence to the trust region.
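For context, the two clipped losses mentioned in the abstract are often written as in the sketch below. This is a minimal illustration of the widely used formulation, not the refined loss proposed in the paper, which the abstract does not specify. The function names, the plain-list inputs, and the default `eps = 0.2` are assumptions made for this example.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def clipped_actor_loss(ratios, advantages, eps=0.2):
    """Clipped surrogate loss for the actor (to be minimized).

    ratios: new-policy / old-policy probability ratios per sample.
    The min() keeps the more pessimistic of the clipped and
    unclipped surrogate terms.
    """
    terms = [
        min(r * a, clip(r, 1.0 - eps, 1.0 + eps) * a)
        for r, a in zip(ratios, advantages)
    ]
    return -sum(terms) / len(terms)


def clipped_critic_loss(values, old_values, returns, eps=0.2):
    """Commonly used clipped value loss for the critic (assumed form).

    The new value estimate is clipped to within eps of the old
    estimate, and the larger of the two squared errors is kept.
    """
    terms = []
    for v, v_old, ret in zip(values, old_values, returns):
        v_clipped = v_old + clip(v - v_old, -eps, eps)
        terms.append(max((v - ret) ** 2, (v_clipped - ret) ** 2))
    return 0.5 * sum(terms) / len(terms)


if __name__ == "__main__":
    # Value moved 0.5 away from the old estimate, toward the target.
    print(clipped_critic_loss([1.5], [1.0], [2.0]))
    # Same move, but away from the target.
    print(clipped_critic_loss([1.5], [1.0], [1.0]))
```

In the first call the clipped term is the larger one, so in this formulation the loss no longer depends on the new value. In the second call the unclipped term is the larger one, so the clip has no effect on the loss even though the value has moved more than `eps` from the old estimate.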
Keyword
artificial intelligence, learning (artificial intelligence)
KSP Keywords
Policy optimization, Reinforcement learning (RL), Reinforcement learning algorithm, artificial intelligence, trust region
This work is distributed under the terms of a Creative Commons License (CC BY-NC-ND).