ETRI-Knowledge Sharing Platform

Details

Conference paper: An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection
Cited 230 times in Scopus. Downloaded 24 times.
Authors
Youngwan Lee (이영완), Joong-won Hwang (황중원), Sangrok Lee (이상록), Yuseok Bae (배유석), Jongyoul Park (박종열)
Publication date
2019-06
Source
Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2019, pp.752-760
DOI
https://dx.doi.org/10.1109/CVPRW.2019.00103
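Funded project
19HS3400, (DeepView, Subtask 1) Development of a High-Performance Visual Discovery Platform for Real-Time Understanding and Prediction of Large-Scale Video Data, Jongyoul Park (박종열)
Abstract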
As DenseNet conserves intermediate features with diverse receptive fields by aggregating them with dense connection, it shows good performance on the object detection task. Although feature reuse enables DenseNet to produce strong features with a small number of model parameters and FLOPs, a detector with a DenseNet backbone shows rather slow speed and low energy efficiency. We find that the input channels, which grow linearly under dense connection, lead to heavy memory access cost, which causes computation overhead and more energy consumption. To solve the inefficiency of DenseNet, we propose an energy and computation efficient architecture called VoVNet, comprised of One-Shot Aggregation (OSA). The OSA not only adopts the strength of DenseNet, which represents diversified features with multiple receptive fields, but also overcomes the inefficiency of dense connection by aggregating all features only once in the last feature maps. To validate the effectiveness of VoVNet as a backbone network, we design both lightweight and large-scale VoVNet and apply them to one-stage and two-stage object detectors. Our VoVNet-based detectors outperform DenseNet-based ones with 2× faster speed, and energy consumption is reduced by 1.6× to 4.1×. In addition to DenseNet, VoVNet also outperforms the widely used ResNet backbone with faster speed and better energy efficiency. In particular, the small object detection performance has been significantly improved over DenseNet and ResNet.
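The contrast the abstract draws between dense connection and One-Shot Aggregation can be illustrated with a rough channel-counting sketch. Everything numeric here is an assumption made for illustration, not a value from the paper: the block input width, the per-layer width, the number of layers, the feature-map size, and the memory-access cost model (feature maps read and written plus weights read, per convolution). The function names are hypothetical as well. The sketch only shows why per-layer input width grows linearly under dense connection while staying constant when features are aggregated once at the end.

```python
# Illustrative sketch only. Channel widths, layer counts, and the cost model
# are assumptions chosen for this example, not figures from the paper.

def dense_input_channels(c_in, growth, num_layers):
    """Input width of each layer when every layer sees all earlier outputs."""
    return [c_in + i * growth for i in range(num_layers)]


def osa_input_channels(c_in, width, num_layers):
    """Input width of each layer when a layer sees only its predecessor."""
    return [c_in] + [width] * (num_layers - 1)


def aggregated_channels(c_in, width, num_layers):
    """Width of the final concatenation (assumed to include the block input)."""
    return c_in + num_layers * width


def conv_mac(h, w, c_in, c_out, k=3):
    """Assumed memory-access model: activations read and written, plus weights."""
    return h * w * (c_in + c_out) + k * k * c_in * c_out


def block_mac(input_channels, c_out, h, w):
    """Sum the assumed memory-access cost over the k x k convolutions of a block."""
    return sum(conv_mac(h, w, c, c_out) for c in input_channels)


if __name__ == "__main__":
    h = w = 56                        # hypothetical feature-map size
    c_in, width, layers = 64, 32, 5   # hypothetical block configuration

    dense = dense_input_channels(c_in, width, layers)
    osa = osa_input_channels(c_in, width, layers)

    print("dense per-layer input channels:", dense)
    print("OSA per-layer input channels:  ", osa)
    print("final aggregated channels:     ", aggregated_channels(c_in, width, layers))
    print("dense block cost (assumed model):", block_mac(dense, width, h, w))
    print("OSA block cost (assumed model):  ", block_mac(osa, width, h, w))
```

Under these assumptions both blocks end with the same concatenated width, but only the dense variant pays for a wider input at every intermediate layer. The comparison ignores the final 1×1 aggregation convolution and any difference in layer widths between real DenseNet and VoVNet configurations.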
KSP suggested keywords
Backbone Network, Detection task, Energy Efficiency, Feature Map, Low energy, Memory access cost, Model parameter, One-stage, Real-time object detection, Receptive field, Two-Stage