ETRI-Knowledge Sharing Platform

Detail

Conference Paper: Distributed Deep Learning Training Platform for Heterogeneous GPU Clusters (이종 GPU 클러스터를 위한 분산 딥러닝 학습 플랫폼)
Authors
안신영
Issue Date
2023-06
Citation
Summer Conference of the Institute of Electronics and Information Engineers (IEIE) 2023, pp. 2694-2697
Publisher
The Institute of Electronics and Information Engineers (IEIE)
Language
Korean
Type
Conference Paper
Abstract
As artificial intelligence technology matures, competition to develop deep neural networks (DNNs) with higher accuracy has intensified. DNNs are growing larger to achieve higher performance, which significantly increases the time and cost of development. In this paper, we propose EDDIS, a distributed DNN training platform that integrates heterogeneous GPU resources to provide high-speed distributed training. EDDIS provides a methodology for modifying existing TensorFlow/PyTorch code to enable distributed training. It also offers an asynchronous parameter update method that addresses the straggler problem of synchronous parameter update methods, thereby improving distributed training performance.
KSP Keywords
Artificial intelligence technology, Distributed training, High Speed, Higher performance, Straggler problem
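Illustration

The straggler problem named in the abstract can be made concrete with a small toy model. The sketch below is not EDDIS code and does not reflect its implementation. The function names, the per-step times, and the time budget are all hypothetical and chosen only for illustration. It counts how many gradient contributions reach the shared parameters within a fixed time budget when every round waits for the slowest worker (synchronous) versus when each worker pushes updates at its own pace (asynchronous).

```python
from typing import List


def sync_updates(step_times: List[float], budget: float) -> int:
    """Gradient contributions applied under synchronous updates.

    Every round ends only when the slowest worker (the straggler)
    finishes, so the round time is the maximum per-worker step time.
    """
    round_time = max(step_times)
    rounds = int(budget // round_time)
    return rounds * len(step_times)


def async_updates(step_times: List[float], budget: float) -> int:
    """Gradient contributions applied under asynchronous updates.

    Each worker pushes an update as soon as its own step finishes,
    without waiting for the others.
    """
    return sum(int(budget // t) for t in step_times)


# Hypothetical heterogeneous cluster: seconds per training step per worker.
step_times = [1.0, 1.0, 2.0, 4.0]
budget = 8.0  # hypothetical time budget in seconds

print(sync_updates(step_times, budget))   # 8
print(async_updates(step_times, budget))  # 22
```

In this toy setting the fast workers sit idle for most of each synchronous round, which is the cost that asynchronous updates avoid. The model only counts updates: asynchronous updates are computed from parameters that may already be stale, so a higher update count does not by itself mean faster convergence.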