ETRI Knowledge Sharing Platform : Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding

Conference Paper: Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding

Cited 11 times in Scopus

Citation: International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023, pp.1-5

Abstract: Low- and ultra-low-bitrate neural speech codecs have achieved unprecedented coding gain by generating speech signals from compact features. This paper improves coding efficiency further by using a feature predictor to reduce the temporal redundancy in the frame-level feature sequence. The predictor produces low-entropy residual representations, which we code discriminatively according to their contribution to the signal reconstruction. Combining feature prediction with discriminative coding optimizes bitrate efficiency by assigning more bits to hard-to-predict events. We demonstrate the advantage of the proposed methods using LPCNet as the neural vocoder, resulting in a scalable, lightweight, low-latency, and low-bitrate neural speech coding system. Although our approach guarantees strict causality in the frame-level prediction, subjective tests and feature space analysis show that our model achieves superior coding efficiency compared to the loosely causal LPCNet and Lyra V2 at very low bitrates.
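
The idea of predictive residual coding with discriminative bit allocation can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the predictor architecture or the quantizer, so everything below is an assumption made for illustration. The hypothetical `predict` function simply repeats the last decoded frame in place of a neural predictor, and uniform scalar quantizers with two hypothetical step sizes (`coarse_step`, `fine_step`) stand in for the actual residual coding. The `threshold` on residual magnitude is likewise a made-up parameter.

```python
from typing import List, Tuple

Frame = List[float]
Packet = Tuple[str, List[int]]  # (mode, quantized residual codes)

# Hypothetical quantizer step sizes: a finer step means more levels, i.e. more bits.
STEPS = {"coarse": 0.5, "fine": 0.05}


def predict(prev_decoded: Frame) -> Frame:
    """Toy stand-in for a neural predictor: repeat the last decoded frame.

    It only sees past decoded frames, so the prediction is strictly causal.
    """
    return list(prev_decoded)


def encode(frames: List[Frame], threshold: float = 0.5) -> List[Packet]:
    """Code each frame as a quantized residual against the causal prediction."""
    prev = [0.0] * len(frames[0])
    stream: List[Packet] = []
    for x in frames:
        pred = predict(prev)
        residual = [a - b for a, b in zip(x, pred)]
        magnitude = sum(r * r for r in residual) ** 0.5
        # Discriminative step: hard-to-predict frames get the fine quantizer.
        mode = "fine" if magnitude > threshold else "coarse"
        step = STEPS[mode]
        codes = [round(r / step) for r in residual]
        stream.append((mode, codes))
        # Closed loop: track what the decoder will reconstruct, not the input.
        prev = [p + c * step for p, c in zip(pred, codes)]
    return stream


def decode(stream: List[Packet], dim: int) -> List[Frame]:
    """Mirror the encoder: add the dequantized residual to the prediction."""
    prev = [0.0] * dim
    out: List[Frame] = []
    for mode, codes in stream:
        step = STEPS[mode]
        prev = [p + c * step for p, c in zip(predict(prev), codes)]
        out.append(prev)
    return out


frames = [[1.0, 2.0], [1.02, 2.01], [3.0, -1.0]]
stream = encode(frames)
decoded = decode(stream, dim=2)
for (mode, codes), frame in zip(stream, decoded):
    print(mode, codes, frame)
```

In this sketch the second frame is almost identical to the first, so its residual falls under the threshold and is sent through the coarse quantizer, while the abrupt third frame is coded finely. Running the predictor on decoded frames rather than on the original input keeps the encoder and decoder in sync.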

KSP Keywords: Coding Gain, Coding efficiency, Feature Sequence, Feature prediction, Feature space analysis, Frame-level, Low latency, Residual coding, Signal Reconstruction, Speech Signals, Speech coding system
