ETRI-Knowledge Sharing Plaform

ENGLISH

성과물

논문 검색
구분 SCI
연도 ~ 키워드

상세정보

학술지 Knowledge Transfer for On-Device Deep Reinforcement Learning in Resource Constrained Edge Computing Systems
Cited 18 time in scopus Download 159 time Share share facebook twitter linkedin kakaostory
저자
장인국, 김현석, 이동훈, 손영성, 김성현
발행일
202008
출처
IEEE Access, v.8, pp.146588-146597
ISSN
2169-3536
출판사
IEEE
DOI
https://dx.doi.org/10.1109/ACCESS.2020.3014922
협약과제
20ZR1100, 자율적으로 연결·제어·진화하는 초연결 지능화 기술 연구, 박준희
초록
Deep reinforcement learning (DRL) is a promising approach for developing control policies by learning how to perform tasks. Edge devices are required to control their actions by exploiting DRL to solve tasks autonomously in various applications such as smart manufacturing and autonomous driving. However, the resource limitations of edge devices make it unfeasible for them to train their policies from scratch. It is also impractical for such an edge device to use the policy with a large number of layers and parameters, which is pre-trained by a centralized cloud infrastructure with high computational power. In this paper, we propose a method, on-device DRL with distillation (OD3), to efficiently transfer distilled knowledge of how to behave for on-device DRL in resource-constrained edge computing systems. Our proposed method makes it possible to simultaneously perform knowledge transfer and policy model compression in a single training process on edge devices with considering their limited resource budgets. The novelty of our method is to apply a knowledge distillation approach to DRL based edge device control in integrated edge cloud environments. We analyze the performance of the proposed method by implementing it on a commercial embedded system-on-module equipped with limited hardware resources. The experimental results show that 1) edge policy training with the proposed method achieves near-cloud-performance in terms of average rewards, although the size of the edge policy network is significantly smaller compared to that of the cloud policy network and 2) the training time elapsed for edge policy training with our method is reduced significantly compared to edge policy training from scratch.
KSP 제안 키워드
Cloud policy, Computational Power, Control policy, Deep reinforcement learning, Device Control, Edge cloud, Edge devices, Embedded system, Hardware Resources, Knowledge transfer, Model compression
본 저작물은 크리에이티브 커먼즈 저작자 표시 (CC BY) 조건에 따라 이용할 수 있습니다.
저작자 표시 (CC BY)