ETRI Knowledge Sharing Platform : Progressive Multi-Stage Neural Audio Coding with Guided References


Conference Paper: Progressive Multi-Stage Neural Audio Coding with Guided References

Cited 11 times in Scopus

Citation: International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022, pp.876-880

Abstract: In this paper, we propose an effective multi-stage neural audio coding algorithm that encodes full-band audio signals (up to 20 kHz) using an end-to-end training criterion. By predefining several dyadic subband signals as training targets, we progressively encode input audio signals in each stage such that deeper stages of the network encode the residual error terms from the previous encoding stage. Our proposed audio codec successfully decodes full-band audio signals by using an effective multi-stage vector quantization scheme to represent key encoding features extracted in the latent space. Subjective listening tests show that the decoded outputs of the proposed audio codec achieve almost transparent quality at an average bitrate of 132 kbps.
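
The abstract's idea of having each stage encode the residual error of the previous stage can be illustrated with a toy multi-stage vector quantizer. This is only a minimal sketch under stated assumptions, not the authors' codec: the function names (`nearest`, `msvq_encode`, `msvq_decode`) and the tiny hand-written codebooks are hypothetical, and in the paper the quantized features come from a neural network trained end to end rather than from fixed tables.

```python
def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def msvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage codes the previous stage's residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen codeword so the next stage sees only the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def msvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the codewords selected at each stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


if __name__ == "__main__":
    # Hypothetical two-stage codebooks (assumption: 2-D features, 2 codewords each).
    codebooks = [
        [[0.0, 0.0], [1.0, 1.0]],   # stage 1: coarse approximation
        [[0.0, 0.0], [0.5, -0.5]],  # stage 2: refines the stage-1 residual
    ]
    indices = msvq_encode([1.4, 0.6], codebooks)
    print(indices, msvq_decode(indices, codebooks))
```

In this toy setup the input [1.4, 0.6] is coded as indices [1, 1] and decoded as [1.5, 0.5]. Only the per-stage indices need to be transmitted, so each stage adds log2(codebook size) bits per vector, and adding stages trades bitrate for a smaller reconstruction error.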

KSP Keywords: Audio codec, Audio coding, Audio signal, End to End(E2E), Full-band, Latent space, Multi-stage vector quantization, Residual Error, Vector Quantization(VQ), coding algorithm, end-to-end training
