ETRI-Knowledge Sharing Platform





Conference paper: Distributed Deep Learning Framework based on Shared Memory for Fast Deep Neural Network Training
Cited 12 times in Scopus, downloaded 9 times
임은지, 안신영, 박유미, 최완
International Conference on Information and Communication Technology Convergence (ICTC) 2018, pp.1239-1242
18HS2700, Development of an HPC System for High-Speed Processing of Large-Scale Deep Learning, 최완
In distributed deep neural network (DNN) training, the communication overhead caused by parameter sharing across multiple deep learning workers can become a performance bottleneck, so efficient parameter sharing is a crucial challenge for distributed deep learning frameworks. In this paper, we propose a distributed deep learning framework called TFSM, which uses remote shared memory for efficient parameter sharing to accelerate distributed DNN training. TFSM is based on a remote shared memory framework that provides shared memory accessible by multiple machines at high speed. TFSM provides a new asynchronous parameter update method based on the remote shared memory. By training well-known deep learning models using 8 GPU workers, we confirmed that TFSM reduces DNN training time compared to TensorFlow.
KSP Suggested Keywords
Communication overhead, Deep learning framework, Deep neural network (DNN), High speed, Neural network training, Shared memory, Training time, Deep learning (DL), Deep learning models, Parameter sharing, Performance bottleneck