ETRI Knowledge Sharing Platform : A Deep Learning Convolution Architecture for Simple Embedded Applications

Conference Paper A Deep Learning Convolution Architecture for Simple Embedded Applications

Cited 0 times in Scopus

Citation: International Conference on Consumer Electronics (ICCE) 2017 : Berlin, pp.20-24

Abstract: A simple AXI-based convolution architecture for deep learning is presented. Input feature maps and kernel weights are stored in P×K×K memory blocks. Convolution proceeds from output feature map 0 to M-1, and within each feature map, outputs are generated in raster-scan order. Data from P input feature maps are summed in parallel during convolution. By manipulating the read addresses and the read-data alignment, the architecture can supply the P×K×K input feature map data, the P×K×K weights, and the bias for the input and output feature maps being processed. Dual buffers allow convolution for the current output feature map to proceed while the DMA write of the previous final output feature map is still in progress. Correct operation was verified by comparing RTL simulation results with those of a C program. The method provides a speed-up of over 2,000 times compared with a pure software method, and with flow control between DMA and convolution, much less memory is needed. The architecture can be used to accelerate convolution for moderate deep learning applications on embedded systems.
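The loop order described in the abstract can be illustrated with a small software reference model. This is only a sketch and not the authors' RTL or C program: the function name conv_reference, the argument layout, the stride of 1, the absence of padding and activation, and the single bias per output feature map are all assumptions made for illustration. The inner group of P input maps with a K-by-K kernel is the part that the hardware would evaluate in parallel; here it is simply executed sequentially.

```python
def conv_reference(inputs, weights, biases, P):
    """Sequential model of the loop order (hypothetical helper, not from the paper).

    inputs  : list of N input feature maps, each H x W (lists of lists)
    weights : weights[m][n] is the K x K kernel from input map n to output map m
    biases  : one bias per output feature map (assumption)
    P       : number of input maps summed together in one step
    """
    N = len(inputs)
    H, W = len(inputs[0]), len(inputs[0][0])
    M = len(weights)
    K = len(weights[0][0])
    out_h, out_w = H - K + 1, W - K + 1      # assumes stride 1, no padding

    outputs = []
    for m in range(M):                       # output feature maps 0 .. M-1
        fmap = [[0] * out_w for _ in range(out_h)]
        for y in range(out_h):               # raster scan: row by row,
            for x in range(out_w):           # left to right within a row
                acc = biases[m]
                for n0 in range(0, N, P):    # one group of up to P input maps
                    # In hardware these P*K*K products would be formed and
                    # summed in parallel; here they run one after another.
                    for n in range(n0, min(n0 + P, N)):
                        for i in range(K):
                            for j in range(K):
                                acc += inputs[n][y + i][x + j] * weights[m][n][i][j]
                fmap[y][x] = acc
        outputs.append(fmap)
    return outputs
```

Because P only changes how the input maps are grouped, the result should not depend on it. Comparing the output of such a model against simulation results is the kind of check the abstract describes.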

KSP Keywords: C program, Data alignment, Embedded applications, Feature map, Flow control, Input-Output, Kernel weights, RTL simulation, Scan order, Speed-up, Deep learning (DL)
