ETRI Knowledge Sharing Platform : The High-performance convolution design and implementation using parallel memory processing and shift register pipeline

Conference Paper: The High-performance convolution design and implementation using parallel memory processing and shift register pipeline

Citation: International Conference on Electronics, Information and Communication (ICEIC) 2024, pp.1253-1256

Abstract: This paper addresses the hardware implementation of a CNN deep learning system, focusing on how to implement convolution filters, which are a known time bottleneck because of their data handling and computation. Computing a single output pixel with a 3 × 3 or 5 × 5 filter requires reading 9 or 25 data values from memory, and the MAC (Multiply-Accumulate) processing of those values takes multiple clock cycles. Because a memory can read only 1 or 2 values at a time, each output pixel costs anywhere from several to tens of memory reads. The paper presents a way to process convolution filters efficiently and cost-effectively by combining parallel memory processing with a shift-register pipeline.
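
The abstract does not spell out the architecture, so the following is only a minimal software sketch of the general idea, assuming the common line-buffer arrangement: k - 1 row buffers stand in for the parallel memories, and k shift registers of length k hold the current window. All names (`naive_read_cycles`, `stream_convolve`, `line_buffers`, `window`) are hypothetical and not taken from the paper. The first function reproduces the read-count arithmetic from the abstract; the second shows how one new pixel per simulated clock can be enough to keep a full k × k window available.

```python
from collections import deque
from math import ceil


def naive_read_cycles(k, values_per_read):
    """Reads needed per output pixel if all k*k inputs are fetched from one memory."""
    return ceil(k * k / values_per_read)


def stream_convolve(image, kernel):
    """CNN-style (no kernel flip) valid-mode filtering, one input pixel per simulated clock.

    Assumed structure, not the paper's confirmed design:
    - line_buffers: k-1 row memories, all read at the same column in parallel
    - window: k shift registers of length k holding the current k x k patch
    """
    k = len(kernel)
    height, width = len(image), len(image[0])
    line_buffers = [[0] * width for _ in range(k - 1)]
    window = [deque([0] * k, maxlen=k) for _ in range(k)]
    output = []

    for y in range(height):
        out_row = []
        for x in range(width):
            pixel = image[y][x]  # the only new value fetched on this clock

            # Parallel read: one value per line buffer, plus the incoming pixel.
            column = [line_buffers[r][x] for r in range(k - 1)] + [pixel]

            # Age the line buffers by one row at this column.
            for r in range(k - 2):
                line_buffers[r][x] = line_buffers[r + 1][x]
            if k > 1:
                line_buffers[k - 2][x] = pixel

            # Shift the new column into the window registers.
            for r in range(k):
                window[r].append(column[r])

            # The window is valid once k rows and k columns have been seen.
            if y >= k - 1 and x >= k - 1:
                acc = 0
                for r in range(k):
                    for c in range(k):
                        acc += kernel[r][c] * window[r][c]  # MAC over the window
                out_row.append(acc)
        if out_row:
            output.append(out_row)
    return output


if __name__ == "__main__":
    for k in (3, 5):
        print(k, naive_read_cycles(k, 1), naive_read_cycles(k, 2))
    image = [[4 * row + col + 1 for col in range(4)] for row in range(4)]
    ones = [[1] * 3 for _ in range(3)]
    print(stream_convolve(image, ones))
```

Under these assumptions, fetching every input separately costs 9 or 5 reads per output pixel for a 3 × 3 filter and 25 or 13 for a 5 × 5 filter (at 1 or 2 values per read), while the streaming version fetches each input pixel once. In hardware the k × k multiplications in the inner loops would run in parallel; the sketch executes them sequentially only because it is a software model.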

KSP Keywords: Convolution filters, Data processing, Hardware implementation, High performance, Multiply and Accumulate(MAC), Parallel memory, Read-only, Shift Register, deep learning(DL), design and implementation, learning system
