ETRI Knowledge Sharing Platform

Quantization Noise Masking in Perceptual Neural Audio Coder
Authors
Seungmin Shin, Joon Byun, Jongmo Sung, Seungkwon Beack, Youngcheol Park
Issue Date
2024-04
Citation
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024, pp.1246-1250
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICASSP48485.2024.10446359
Abstract
This study investigates the implications of utilizing a psychoacoustic model (PAM) within a neural audio coder (NAC), focusing specifically on the masking of quantization noise. We introduce a novel training strategy that incorporates the PAM into the NAC more accurately. The method employs a discriminator that measures the PAM loss either directly or indirectly. For the indirect measurement, a multi-scale STFT discriminator (MS-STFTD) is incorporated to introduce an auxiliary loss term in addition to the existing PAM loss. For the direct measurement, we design a multi-scale PAM discriminator (MS-PAMD) that quantifies PAM-specific parameters. Experimental results show that adding the discriminator masks the quantization noise better than the previous NAC and achieves audio quality comparable to commercial AAC in both objective and subjective scores.
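To illustrate the kind of multi-resolution spectral comparison that an MS-STFTD-style auxiliary loss operates on, the sketch below computes a multi-scale STFT loss between a reference and a coded signal. This is a hedged illustration only: the FFT sizes, hop lengths, and the spectral-convergence plus log-magnitude loss form are common choices in the neural-codec literature, not the specific discriminator architecture or loss used in this paper.

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    """Magnitude STFT using a Hann window (NumPy only)."""
    win = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(x[s:s + n_fft] * win))
              for s in range(0, len(x) - n_fft + 1, hop)]
    return np.array(frames)

def multi_scale_stft_loss(ref, est,
                          scales=((2048, 512), (1024, 256), (512, 128))):
    """Average, over several STFT resolutions, of a spectral-convergence
    term plus a log-magnitude term.  The scale list and loss form are
    illustrative assumptions, not the paper's MS-STFTD."""
    total = 0.0
    for n_fft, hop in scales:
        R = stft_mag(ref, n_fft, hop)
        E = stft_mag(est, n_fft, hop)
        # Spectral convergence: relative Frobenius-norm error.
        sc = np.linalg.norm(R - E) / (np.linalg.norm(R) + 1e-8)
        # Log-magnitude L1 distance.
        lm = np.mean(np.abs(np.log(R + 1e-8) - np.log(E + 1e-8)))
        total += sc + lm
    return total / len(scales)
```

Comparing the signal at several time-frequency resolutions is what lets such a loss penalize quantization noise that a single-resolution spectrogram would blur out; a discriminator-based variant replaces these fixed distances with learned ones.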
KSP Keywords
Audio quality, Multi-scale, Psychoacoustic model, Quantization noise, Indirect measurement