ETRI-Knowledge Sharing Platform

Detail

Conference Paper Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding
Cited 10 times in Scopus
Authors
Haici Yang, Wootaek Lim, Minje Kim
Issue Date
2023-06
Citation
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023, pp.1-5
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICASSP49357.2023.10096077
Abstract
Low- and ultra-low-bitrate neural speech codecs have achieved unprecedented coding gain by generating speech signals from compact features. This paper introduces additional coding efficiency in speech coding by reducing the temporal redundancy existing in the frame-level feature sequence via a feature predictor. This predictor produces low-entropy residual representations, and we discriminatively code them based on their contribution to the signal reconstruction. Combining feature prediction and discriminative coding optimizes bitrate efficiency by assigning more bits to hard-to-predict events. We demonstrate the advantage of the proposed methods using LPCNet as a neural vocoder, resulting in a scalable, lightweight, low-latency, and low-bitrate neural speech coding system. While our approach guarantees strict causality in the frame-level prediction, the subjective tests and feature space analysis show that our model achieves superior coding efficiency compared to the loosely-causal LPCNet and Lyra V2 at very low bitrates.
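The general idea described in the abstract (predict each frame causally, then spend more bits on residuals that are hard to predict) can be illustrated with a small Python sketch. This is not the authors' implementation: the paper uses a neural feature predictor with LPCNet as the vocoder, whereas the sketch below assumes a trivial "repeat the last decoded frame" predictor, uniform scalar quantization, and an energy threshold to pick between a fine and a coarse quantizer. The function names, the threshold, and the step sizes are all hypothetical choices made for illustration only.

```python
import math


def quantize(vec, step):
    """Uniform scalar quantizer: snap each value to the nearest multiple of step."""
    return [round(v / step) * step for v in vec]


def encode_decode(frames, threshold=0.5, fine_step=0.05, coarse_step=0.4):
    """Closed-loop predictive coding of a frame-level feature sequence.

    Assumptions (not from the paper): the predictor simply repeats the
    previously decoded frame, and "discriminative coding" is reduced to
    choosing a fine or a coarse quantizer from the residual energy.
    Returns (decoded_frames, modes).
    """
    dim = len(frames[0])
    prev = [0.0] * dim  # last decoded frame, available to encoder and decoder alike
    decoded, modes = [], []
    for frame in frames:
        # Strictly causal prediction: only past decoded frames are used.
        prediction = prev
        residual = [x - p for x, p in zip(frame, prediction)]
        energy = math.sqrt(sum(r * r for r in residual))
        # Hard-to-predict frames (large residual) get the finer quantizer,
        # which costs more bits; easy frames get the coarse one.
        if energy > threshold:
            mode, step = "fine", fine_step
        else:
            mode, step = "coarse", coarse_step
        q_residual = quantize(residual, step)
        recon = [p + q for p, q in zip(prediction, q_residual)]
        decoded.append(recon)
        modes.append(mode)
        prev = recon  # closed loop: predict from what the decoder will actually see
    return decoded, modes
```

Running `encode_decode` on a sequence that stays nearly constant and then jumps would mark the steady frames as "coarse" and the onset frames as "fine", which mirrors the abstract's point that bits are concentrated on hard-to-predict events. Predicting from the decoded frame rather than the original one keeps encoder and decoder in sync, so quantization error does not accumulate unseen.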
KSP Keywords
Coding Gain, Coding efficiency, Feature Sequence, Feature prediction, Feature space analysis, Frame-level, Low latency, Residual coding, Signal Reconstruction, Speech Signals, Speech coding system