ETRI-Knowledge Sharing Platform





Conference Paper: Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method
Cited 4 times in Scopus
신승민, 변준, 박영철, 성종모, 백승권
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022, pp.871-875
21ZH1200, Research on Fundamental Technologies for Hyper-Realistic Spatial Media and Content, 이태진
A new end-to-end audio coder based on a deep neural network (DNN) is proposed. To compensate for the perceptual distortion caused by quantization, the proposed coder is optimized to minimize distortions in both the signal and perceptual domains. The distortion in the perceptual domain is measured using the psychoacoustic model (PAM), and a loss function is obtained through a two-stage compensation approach. In addition, scalar uniform quantization is approximated using uniform stochastic noise together with a compression-decompression scheme, which provides simpler and more stable learning than the softmax quantizer without requiring an additional penalty. Test results showed that the proposed coder achieves more accurate noise masking than the previous PAM-based method and better perceptual quality than the MP3 audio coder.
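The quantization approximation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the compression function, so mu-law companding with mu = 255 is assumed here, and all function names and parameters are hypothetical. The idea shown is that hard rounding (used at inference) is replaced during training by additive uniform noise of the same width, applied between a compression and a decompression step.

```python
import math
import random

MU = 255.0  # assumed mu-law constant; the abstract does not specify the compander


def compress(x, mu=MU):
    """Mu-law compression of a sample x in [-1, 1] (assumed compander)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize_hard(x, step):
    """Inference-style path: compress, round to a uniform grid, expand."""
    return expand(round(compress(x) / step) * step)


def quantize_noisy(x, step, rng):
    """Training-style surrogate: uniform noise in [-step/2, step/2]
    stands in for rounding, so the mapping stays differentiable in x."""
    return expand(compress(x) + rng.uniform(-step / 2, step / 2))


if __name__ == "__main__":
    rng = random.Random(0)
    x, step = 0.3, 0.1
    print("hard :", quantize_hard(x, step))
    print("noisy:", quantize_noisy(x, step, rng))
```

In both paths the error in the compressed domain is bounded by half the step size, which is why the noisy surrogate is a reasonable stand-in for rounding during training. In an actual training setup the same operations would be written with a framework's differentiable tensor operations rather than scalar Python math.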
KSP Suggested Keywords
Deep neural network (DNN), End-to-end (E2E), Perceptual quality, Psychoacoustic model, Stochastic noise, Two-stage, Loss function, Training method, Uniform quantization