ETRI-Knowledge Sharing Platform

Audio Coding for Machines: Saliency-Guided Masking for Concentrated Compression
Authors
Jinju Kim, Seungho Kwon, Wootaek Lim, Inseon Jang, Jong Hwan Ko
Issue Date
2026-06
Citation
Korean Institute of Broadcast and Media Engineers (한국방송·미디어공학회) Summer Conference 2026, pp. 1-4
Publisher
Korean Institute of Broadcast and Media Engineers (한국방송·미디어공학회)
Language
English
Type
Conference Paper
Abstract
Today's audio codecs are built around the human auditory system, but the spectro-temporal cues they discard as inaudible often carry weight for machine tasks. This paper revisits Audio Coding for Machines (ACoM) through the lens of saliency-guided pre-compression masking. Building on prior work, we contribute two additional analyses: an analysis of the factors that contribute to the optimal masking ratio for audio, and a step-wise view of how machine-task performance changes as low-saliency bins are progressively removed. Together, they show that the optimal masking level is strongly sample-dependent and that predicting it is a non-trivial learning problem, which is left to future work. Even so, simply suppressing low-saliency regions before encoding already recovers a substantial share of the compression-induced performance gap, pointing to a promising direction.
KSP Keywords
Audio coding, Human auditory system, Spectro-temporal, Performance gap, Pre-compression
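
Illustrative sketch
The abstract describes suppressing low-saliency time-frequency bins before encoding and observing the effect as the masked fraction grows step by step. The following is a minimal Python sketch of that idea only, not the authors' implementation. It assumes the audio is already a magnitude spectrogram stored as a list of rows, that a per-bin saliency map of the same shape is available from some estimator (the record does not say which), and that "suppressing" means setting a bin to zero. The names `mask_low_saliency` and `stepwise_masks` are hypothetical.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def mask_low_saliency(spec: Matrix, saliency: Matrix, ratio: float) -> Matrix:
    """Return a copy of `spec` with the lowest-saliency bins set to zero.

    `ratio` is the fraction of bins to suppress (0.0 keeps everything,
    1.0 removes everything). Zeroing is an assumption of this sketch.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")

    # Collect every (row, col) coordinate and order it by ascending saliency.
    coords = [(r, c) for r, row in enumerate(saliency) for c in range(len(row))]
    coords.sort(key=lambda rc: saliency[rc[0]][rc[1]])

    n_masked = int(ratio * len(coords))
    masked = [list(row) for row in spec]  # copy so the input is left untouched
    for r, c in coords[:n_masked]:
        masked[r][c] = 0.0
    return masked


def stepwise_masks(
    spec: Matrix, saliency: Matrix, ratios: Sequence[float]
) -> Dict[float, Matrix]:
    """Build one masked spectrogram per masking ratio for a step-wise sweep."""
    return {ratio: mask_low_saliency(spec, saliency, ratio) for ratio in ratios}


# Toy 2x2 example: saliency ranks the bins as (0,1) < (1,0) < (1,1) < (0,0).
spec = [[1.0, 2.0], [3.0, 4.0]]
saliency = [[0.9, 0.1], [0.4, 0.7]]
half = mask_low_saliency(spec, saliency, 0.5)
sweep = stepwise_masks(spec, saliency, [0.0, 0.25, 0.5, 0.75, 1.0])
```

In a real pipeline each masked spectrogram in `sweep` would be passed through the codec and then the downstream machine task, so that the best ratio can be read off per sample. Those stages are outside this sketch.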