ETRI Knowledge Sharing Platform





Conference paper: A Deep Learning Convolution Architecture for Simple Embedded Applications
김찬, 조용철, 권영수
International Conference on Consumer Electronics (ICCE) 2017: Berlin, pp. 20-24
17HB2500, Development of Ultra-Low-Power Hypervisor-Based Intelligent Information Manycore Processor and SW Technology, 권영수
A simple AXI-based convolution architecture for deep learning is presented. Input feature maps and kernel weights are stored in P × K × K memory blocks. Convolution proceeds from output feature map 0 to M-1, and within each feature map the output is generated in raster-scan order. Data from P input feature maps are summed in parallel during convolution. By manipulating the read addresses and the read-data alignment, the P × K × K input feature map data, the P × K × K weights, and the bias can be supplied for the input and output feature maps being processed. Dual buffers allow convolution for the current output feature map to proceed while the DMA write of the previous final output feature map is in progress. Correct operation was verified by comparing RTL simulation results with those of a C program. The method provides a speed-up of more than 2,000 times over a pure software implementation, and flow control between DMA and convolution allows much less memory to be used. The architecture can be used to accelerate convolution for moderate-scale deep learning applications on embedded systems.
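The loop order described in the abstract can be illustrated with a small software reference model. This is only a sketch under stated assumptions, not the authors' C program or RTL: the function name `conv_reference`, the symbols N (number of input feature maps), H and W (input height and width), unit stride, and the absence of padding are all hypothetical choices made for illustration. It walks output feature maps 0 to M-1, produces each map in raster-scan order, and accumulates the input feature maps in groups of P, with each group contributing P × K × K products (the part the hardware would sum in parallel).

```python
def conv_reference(inputs, weights, bias, P=4):
    """Software model of the described loop order (illustrative only).

    inputs  : [N][H][W]     input feature maps
    weights : [M][N][K][K]  kernel weights
    bias    : [M]           one bias per output feature map
    P       : number of input feature maps summed together per step
              (assumed parameter; stride 1 and no padding are assumed too)
    """
    N, H, W = len(inputs), len(inputs[0]), len(inputs[0][0])
    M, K = len(weights), len(weights[0][0])
    out_h, out_w = H - K + 1, W - K + 1

    outputs = []
    for m in range(M):                      # output feature map 0 .. M-1
        fmap = []
        for y in range(out_h):              # raster scan: row by row
            row = []
            for x in range(out_w):          # then column by column
                acc = bias[m]
                for n0 in range(0, N, P):   # groups of P input maps
                    group = range(n0, min(n0 + P, N))
                    # P*K*K multiply-accumulates; parallel in hardware,
                    # sequential in this software model.
                    acc += sum(
                        inputs[n][y + i][x + j] * weights[m][n][i][j]
                        for n in group
                        for i in range(K)
                        for j in range(K)
                    )
                row.append(acc)
            fmap.append(row)
        outputs.append(fmap)
    return outputs


# Tiny example: 2 input maps of 3x3 ones, one 2x2 all-ones kernel per map.
ones_in = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
ones_w = [[[[1.0] * 2 for _ in range(2)] for _ in range(2)]]
result = conv_reference(ones_in, ones_w, bias=[0.5], P=2)
print(result)  # every output is 2 maps * 4 taps + 0.5 = 8.5
```

The grouping by P does not change the numerical result for these values; it only mirrors how the described architecture would consume P input feature maps at a time.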
KSP suggested keywords
C program, Data alignment, Embedded applications, Embedded system, Feature map, Kernel weights, RTL simulation, Scan order, Speed-up, Deep learning (DL), Flow control