ETRI Knowledge Sharing Platform

Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method
Cited 7 times in Scopus
Authors
Seungmin Shin, Joon Byun, Youngcheol Park, Jongmo Sung, Seungkwon Beack
Issue Date
2022-05
Citation
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022, pp.871-875
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICASSP43922.2022.9747575
Abstract
A new end-to-end audio coder based on a deep neural network (DNN) is proposed. To compensate for the perceptual distortion caused by quantization, the proposed coder is optimized to minimize distortion in both the signal and perceptual domains. The distortion in the perceptual domain is measured using a psychoacoustic model (PAM), and a loss function is obtained through a two-stage compensation approach. In addition, scalar uniform quantization is approximated with uniform stochastic noise, together with a compression-decompression scheme, which provides simpler and more stable learning than the softmax quantizer without an additional penalty. Test results showed that the proposed coder achieves more accurate noise masking than the previous PAM-based method and better perceptual quality than the MP3 audio coder.
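The abstract mentions approximating scalar uniform quantization with uniform stochastic noise plus a compression-decompression (companding) scheme so that the coder remains differentiable during training. The sketch below is a minimal illustration of that general idea, not the paper's actual implementation: a mu-law-style compander (an assumption; the paper does not specify its compression function) wraps a quantizer that adds uniform noise at training time and rounds to the quantization grid at inference time.

```python
import numpy as np

MU = 255.0  # companding constant (illustrative choice, not from the paper)

def compress(x, mu=MU):
    # Mu-law-style compression: expands resolution near zero.
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def decompress(y, mu=MU):
    # Exact inverse of compress().
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def quantize_train(x, delta, rng):
    # Training-time surrogate: additive uniform noise in [-delta/2, delta/2]
    # stands in for the (non-differentiable) rounding error.
    noise = rng.uniform(-delta / 2, delta / 2, size=np.shape(x))
    return x + noise

def quantize_infer(x, delta):
    # Inference: actual scalar uniform quantization (round to the grid).
    return delta * np.round(np.asarray(x) / delta)

def code_decode(x, delta=0.05, training=False, rng=None):
    # Compress -> (soft or hard) quantize -> decompress.
    y = compress(x)
    if training:
        y_hat = quantize_train(y, delta, rng or np.random.default_rng())
    else:
        y_hat = quantize_infer(y, delta)
    return decompress(y_hat)
```

At inference the pipeline introduces bounded rounding error in the companded domain; at training the uniform-noise surrogate has the same first- and second-order error statistics as rounding, so gradients flow through `x + noise` unchanged.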
KSP Keywords
Deep neural network(DNN), End to End(E2E), Perceptual Quality, Psychoacoustic Model, Stochastic noise, Two-Stage, loss function, training method, uniform quantization