ETRI-Knowledge Sharing Platform

The High-performance convolution design and implementation using parallel memory processing and shift register pipeline
Authors
YoungSeok Baek, BonTae Koo
Issue Date
2024-01
Citation
International Conference on Electronics, Information and Communication (ICEIC) 2024, pp.1253-1256
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICEIC61013.2024.10457129
Abstract
This paper addresses the hardware implementation of a CNN deep learning system, focusing on how to implement convolution filters, which are a known time bottleneck because of their data movement and computation. Applying a 3 × 3 or 5 × 5 filter to obtain a single output pixel requires reading 9 or 25 data values from memory, and the MAC (Multiply-Accumulate) processing of those values takes multiple clock cycles. Since a memory can supply only 1 or 2 data values per read, each output pixel costs anywhere from several to tens of memory reads. This paper presents a solution that processes convolution filters efficiently and cost-effectively by combining parallel memory processing with a shift-register pipeline.
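The abstract does not describe the architecture in detail, so the following is only a software sketch of the general idea, not the authors' design. It assumes a common arrangement in which K-1 row-length line buffers (standing in for memories read in parallel) feed a K × K shift-register window, so each input pixel is read once and one output is produced per streamed pixel. The function names `conv_direct` and `conv_streaming` and the read counters are hypothetical, introduced for illustration.

```python
from collections import deque


def conv_direct(image, kernel):
    """Baseline: re-read all K*K pixels from 'memory' for every output pixel."""
    h, w, k = len(image), len(image[0]), len(kernel)
    reads = 0
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
                    reads += 1  # one memory read per multiply-accumulate
            row.append(acc)
        out.append(row)
    return out, reads


def conv_streaming(image, kernel):
    """Assumed scheme: line buffers plus a K x K shift-register window.

    Pixels arrive in raster order, one per simulated clock, and each is
    read from 'memory' exactly once.
    """
    h, w, k = len(image), len(image[0]), len(kernel)
    # K-1 FIFOs, each one image row long; index 0 is fed by the live pixel.
    line_buffers = [deque([0] * w, maxlen=w) for _ in range(k - 1)]
    window = [[0] * k for _ in range(k)]
    out = [[0] * (w - k + 1) for _ in range(h - k + 1)]
    reads = 0
    for r in range(h):
        for c in range(w):
            pixel = image[r][c]
            reads += 1
            # Chain the FIFOs: each tap is the same column, one row earlier.
            taps = []
            value = pixel
            for fifo in line_buffers:
                delayed = fifo[0]   # oldest entry, written w clocks ago
                fifo.append(value)  # maxlen evicts the oldest entry
                taps.append(delayed)
                value = delayed
            column = taps[::-1] + [pixel]  # ordered top row to bottom row
            # Shift every window row left by one and load the new column.
            for i in range(k):
                window[i] = window[i][1:] + [column[i]]
            # The window holds a full K x K patch once enough pixels arrived.
            if r >= k - 1 and c >= k - 1:
                out[r - k + 1][c - k + 1] = sum(
                    window[i][j] * kernel[i][j]
                    for i in range(k)
                    for j in range(k)
                )
    return out, reads
```

Both functions return the output map together with the number of simulated memory reads. For a 5 × 5 image and a 3 × 3 kernel, the baseline performs 81 reads while the streaming version performs 25, with identical outputs. In hardware the K × K multiplies in the window would run in parallel rather than in a Python loop; the sketch only models the data reuse.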
KSP Keywords
Convolution filters, Data processing, Hardware implementation, High performance, Multiply and Accumulate(MAC), Parallel memory, Read-only, Shift Register, deep learning(DL), design and implementation, learning system