ETRI Knowledge Sharing Platform


Target-Aware Neural Network Execution via Compiler-Guided Pruning
Cited 0 times in Scopus · Downloaded 122 times
Authors
JooHyoung Cha, Taeho Kim, Jemin Lee, Sangtae Ha, Yongin Kwon
Issue Date
2025-11
Citation
IEEE Transactions on Mobile Computing, volume/issue not yet assigned, pp. 1-14
ISSN
1536-1233
Publisher
IEEE
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/TMC.2025.3631673
Abstract
Mobile devices run deep learning models for various purposes, such as image classification and speech recognition. Due to the resource constraints of mobile devices, researchers have focused either on making deep neural network (DNN) models lightweight through model pruning or on generating efficient code through compiler optimization. We observed that straightforwardly combining model compression with compiler auto-tuning often fails to produce the most efficient model for a target device. We propose CPrune, a compiler-informed model pruning method for efficient target-aware DNN execution that supports applications with a required target accuracy. To address real-world deployment scenarios with resource or latency constraints, we further introduce RB-CPrune, a predictive variant that eliminates iterative tuning by using a learned latency estimator. CPrune produces a lightweight DNN model through informed pruning based on the structural information of the subgraphs built during the compiler tuning process. Our experimental results show that CPrune speeds up DNN execution by up to 2.73× over the state-of-the-art TVM auto-tuner while meeting the accuracy requirement.
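The core idea in the abstract, pruning decisions informed by target-specific cost information rather than by parameter count alone, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the toy linear latency model stands in for the learned latency estimator mentioned for RB-CPrune, and the greedy per-layer loop stands in for the paper's actual subgraph-guided pruning, which is not described in detail here.

```python
def predicted_latency(channels, weights):
    """Stand-in for a learned latency estimator: linear in per-layer work.
    In the paper's RB-CPrune this would be a trained predictor (assumption)."""
    return sum(w * c for c, w in zip(channels, weights))

def prune_to_target(channels, weights, target_latency, min_channels=8):
    """Greedily shrink the layer with the largest predicted latency
    contribution on the target device until the latency budget is met."""
    channels = list(channels)
    while predicted_latency(channels, weights) > target_latency:
        # pick the layer whose channels are most expensive on this target
        costs = [w * c for c, w in zip(channels, weights)]
        i = max(range(len(channels)), key=lambda j: costs[j])
        if channels[i] <= min_channels:
            break  # cannot prune further without violating the channel floor
        channels[i] -= 1
    return channels

layers = [64, 128, 256]   # toy per-layer channel counts
lat_w = [0.5, 1.0, 2.0]   # assumed per-channel latency weight on the target
pruned = prune_to_target(layers, lat_w, target_latency=500.0)
print(pruned, predicted_latency(pruned, lat_w))
```

The point of the sketch is the coupling: the same cost model that reflects the target hardware also drives which structures get pruned, which is what distinguishes target-aware pruning from pruning by weight magnitude alone.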
This work is distributed under the terms of the Creative Commons License (CCL) (CC BY-NC-ND).