ETRI-Knowledge Sharing Platform

Feature-Guided Machine-Centric Image Coding for Downstream Tasks
Cited 0 times in Scopus
Authors
Sangwoon Kwak, Joungil Yun, Hyon-Gon Choo, Munchurl Kim
Issue Date
2023-07
Citation
International Conference on Multimedia and Expo (ICME) 2023, pp.176-181
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICMEW59549.2023.00037
Abstract
Video coding, the process of compressing and decompressing digital video content, has traditionally been optimized for the human visual system, reducing data size while maintaining perceptual quality. However, with the remarkable progress of artificial intelligence (AI) technology, the need for machine-centric coding has been increasing rapidly in recent years. In response to this trend, international standardization organizations such as MPEG are actively developing new standards for machine-oriented coding technologies, called video coding for machines (VCM). In this paper, we present a novel feature-guided block-wise image blending method for image compression that is suitable for machine applications such as object detection and segmentation. To this end, we use a gradient map of the feature loss, computed with the pretrained encoder part of a task-specific network, as a guide for input degradation, so that the degraded input images can be compressed effectively for machine-centric tasks. Our method is simple but effective: because it reuses the pretrained encoder parts of networks for the targeted tasks, no additional training is required. Experimental results show that the proposed method achieves average BD-rate gains of 11% and 8% for object detection and instance segmentation, respectively, compared to the image anchor results of the MPEG-VCM reference software v0.4.
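The block-wise blending idea in the abstract can be illustrated with a short sketch. This is not the authors' implementation. It assumes the gradient map has already been computed elsewhere (for example, by backpropagating a feature loss through a pretrained task encoder), that the degraded image is some lower-detail copy of the input, and that blocks with larger gradient magnitude should keep more of the original pixels. The function name `blockwise_blend`, the `block` parameter, and the min-max normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def blockwise_blend(image, degraded, grad_map, block=8):
    """Blend `image` and `degraded` per block, guided by `grad_map`.

    All three arrays are 2-D with the same shape. Both dimensions are
    assumed to be multiples of `block` (a simplification for this sketch).
    """
    h, w = image.shape
    # Average gradient magnitude inside each block.
    mag = np.abs(grad_map).reshape(h // block, block, w // block, block)
    score = mag.mean(axis=(1, 3))
    # Min-max normalize block scores to [0, 1] (assumed weighting rule).
    # A flat map gives zero weight, so the degraded image is kept everywhere.
    span = score.max() - score.min()
    weight = (score - score.min()) / span if span > 0 else np.zeros_like(score)
    # Expand each block weight back to pixel resolution.
    weight_full = np.repeat(np.repeat(weight, block, axis=0), block, axis=1)
    # High weight keeps original pixels; low weight keeps degraded pixels.
    return weight_full * image + (1.0 - weight_full) * degraded
```

Under these assumptions, regions that matter little to the task features are replaced by their degraded version before the image is passed to a conventional codec, which is where any bitrate saving would come from.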
KSP Keywords
Artificial intelligence (AI) technology, Blending method, Gradient map, Image Compression, Image coding, Object detection, Perceptual Quality, Task-specific, Video coding, Video contents, digital video