ETRI Knowledge Sharing Platform : Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method

Type: Conference Paper

Title: Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method

Cited 9 times in Scopus

Authors: Seungmin Shin, Joon Byun, Youngcheol Park, Jongmo Sung, Seungkwon Beack

Citation: International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022, pp.871-875

Abstract: A new end-to-end audio coder based on a deep neural network (DNN) is proposed. To compensate for the perceptual distortion caused by quantization, the proposed coder is optimized to minimize distortion in both the signal and perceptual domains. The distortion in the perceptual domain is measured using a psychoacoustic model (PAM), and the loss function is obtained through a two-stage compensation approach. In addition, scalar uniform quantization is approximated using uniform stochastic noise together with a compression-decompression scheme, which provides simpler but more stable learning than the softmax quantizer, with no additional penalty. Test results showed that the proposed coder achieves more accurate noise masking than the previous PAM-based method and better perceptual quality than the MP3 audio coder.
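The quantization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the compression function, so mu-law companding is assumed here as a stand-in, and the names `compress`, `expand`, `quantize`, `code_sample`, the constant `MU`, and the step size `1/16` are all hypothetical. During training, rounding is replaced by additive uniform noise of one step width, which keeps the operation differentiable; at inference, ordinary rounding to the uniform grid is used.

```python
import math
import random

MU = 255.0  # assumed companding constant; not taken from the paper


def compress(x, mu=MU):
    """Mu-law style compression of a sample in [-1, 1] (assumed stand-in)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress(): maps the companded value back to [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(y, step, training, rng=random):
    """Scalar uniform quantizer with a stochastic-noise training proxy.

    training=True : add uniform noise drawn from [-step/2, step/2].
    training=False: round to the nearest multiple of step.
    """
    if training:
        return y + rng.uniform(-step / 2, step / 2)
    return step * round(y / step)


def code_sample(x, step=1 / 16, training=False, rng=random):
    """Compress, quantize (or add noise), then decompress one sample."""
    return expand(quantize(compress(x), step, training, rng))
```

In both modes the error introduced in the companded domain is bounded by half a step, which is why the noise model can serve as a training-time proxy for the rounding used at inference. In a real coder these operations would act on the network's latent representation rather than on individual samples.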

KSP Keywords: Deep neural network (DNN), End-to-end (E2E), Perceptual quality, Psychoacoustic model, Stochastic noise, Two-stage, Loss function, Neural network (NN), Training method, Uniform quantization
