ETRI-Knowledge Sharing Platform

New Annex E on superwideband scalable extension

이미숙, 김보연, 이명숙, 김현우, 성종모, 김도영, 김동규, 이수영, 권안나, 김현우
ITU-T G.729.1 Amd. 6
13PI2200, Development of Core Technologies for Immersive Smart Work for Multi-party Collaboration, 김도영
Recommendation ITU-T G.729.1 describes an 8-32 kbit/s scalable wideband speech and audio coding algorithm interoperable with ITU-T G.729, ITU-T G.729A and ITU-T G.729B.

The output of the ITU-T G.729.1 coder (developed under the working name G.729EV) has a bandwidth of 50-4000 Hz at 8 and 12 kbit/s and 50-7000 Hz from 14 to 32 kbit/s. At 8 kbit/s, ITU-T G.729.1 is fully interoperable with ITU-T G.729, Annex A/G.729 and Annex B/G.729. Hence, an efficient deployment in existing ITU-T G.729-based VoIP infrastructures is foreseen. The coder operates on 20 ms frames and has an algorithmic delay of 48.9375 ms. By default, the encoder input and decoder output are sampled at 16 kHz.
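
The figures above can be checked with a little arithmetic. The following Python sketch converts the 20 ms frame length and the 48.9375 ms algorithmic delay to sample counts at the default 16 kHz rate, and maps a bit rate to the output bandwidth quoted above. The function and constant names are illustrative assumptions and do not come from the Recommendation or its reference C code.

```python
from fractions import Fraction

SAMPLE_RATE_HZ = 16_000                     # default input/output sampling rate
FRAME_MS = Fraction(20)                     # frame length in milliseconds
ALGORITHMIC_DELAY_MS = Fraction("48.9375")  # algorithmic delay in milliseconds


def ms_to_samples(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a duration in ms to a sample count using exact arithmetic."""
    return Fraction(duration_ms) * sample_rate_hz / 1000


def output_bandwidth_hz(bit_rate_kbps):
    """Return the (low, high) output bandwidth in Hz for a given bit rate."""
    if bit_rate_kbps in (8, 12):
        return (50, 4000)   # narrowband output
    if 14 <= bit_rate_kbps <= 32:
        return (50, 7000)   # wideband output
    raise ValueError(f"bit rate outside 8-32 kbit/s: {bit_rate_kbps}")


if __name__ == "__main__":
    print(ms_to_samples(FRAME_MS))              # samples per frame
    print(ms_to_samples(ALGORITHMIC_DELAY_MS))  # delay in samples
    print(output_bandwidth_hz(14))
```

Under these assumptions a frame holds 320 samples and the algorithmic delay corresponds to 783 samples, that is, two full frames plus 143 samples.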

The encoder produces an embedded bitstream structured in 12 layers corresponding to 12 available bit rates from 8 to 32 kbit/s. The bitstream can be truncated at the decoder side, or by any component of the communication system, to adjust the bit rate to the desired value on the fly, with no need for out-of-band signalling.
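
Because the bitstream is embedded, lowering the bit rate amounts to dropping the upper layers of each frame. The Python sketch below illustrates the idea under stated assumptions: the rates are taken to be 8 and 12 kbit/s followed by 2 kbit/s steps from 14 to 32 kbit/s (which gives the 12 rates mentioned above), and a frame is modelled as a headerless byte string with the layers packed in increasing order. The names `frame_bytes` and `truncate_frame` are hypothetical, and real transport formats such as the RTP payload add framing that is not modelled here.

```python
FRAME_MS = 20  # frame length in milliseconds

# Assumed rate ladder: 8, 12, then 14..32 kbit/s in 2 kbit/s steps.
BIT_RATES_KBPS = [8, 12] + list(range(14, 33, 2))


def frame_bytes(bit_rate_kbps, frame_ms=FRAME_MS):
    """Size in bytes of one frame at the given bit rate."""
    bits = bit_rate_kbps * frame_ms  # kbit/s * ms = bits
    if bits % 8 != 0:
        raise ValueError("frame does not fill a whole number of bytes")
    return bits // 8


def truncate_frame(frame, target_kbps):
    """Keep only the leading layers needed for the target bit rate."""
    if target_kbps not in BIT_RATES_KBPS:
        raise ValueError(f"unsupported bit rate: {target_kbps} kbit/s")
    size = frame_bytes(target_kbps)
    if size > len(frame):
        raise ValueError("cannot truncate to a higher rate than received")
    return frame[:size]


if __name__ == "__main__":
    full_frame = bytes(frame_bytes(32))      # placeholder 32 kbit/s frame
    print(len(truncate_frame(full_frame, 14)))
```

With these assumptions, a 32 kbit/s frame occupies 80 bytes and its 8 kbit/s core occupies the first 20 bytes, so a network element can reduce the rate without decoding or re-encoding anything.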

The underlying algorithm is based on a three-stage coding structure: embedded Code-Excited Linear Prediction (CELP) coding of the lower band (50-4000 Hz), parametric coding of the higher band (4000-7000 Hz) by Time-Domain Bandwidth Extension (TDBWE), and enhancement of the full band (50-7000 Hz) by a predictive transform coding technique referred to as Time-Domain Aliasing Cancellation (TDAC).
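
The three-stage structure can be summarized as a small lookup table. The following Python sketch records the band that each stage handles, as stated above, and reports which stages contribute at a given frequency. The `Stage` type and `stages_covering` helper are illustrative assumptions, and the inclusive band edges are a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    low_hz: int
    high_hz: int


# Bands as described in the summary of ITU-T G.729.1.
STAGES = (
    Stage("CELP", 50, 4000),     # embedded CELP coding of the lower band
    Stage("TDBWE", 4000, 7000),  # parametric coding of the higher band
    Stage("TDAC", 50, 7000),     # transform-coding enhancement of the full band
)


def stages_covering(freq_hz):
    """Names of the stages whose band includes the given frequency."""
    return [s.name for s in STAGES if s.low_hz <= freq_hz <= s.high_hz]


if __name__ == "__main__":
    print(stages_covering(1000))
    print(stages_covering(6000))
```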

Amendment 1 introduces the new Annex A containing the RTP payload format, capability identifiers and parameters for signalling of ITU-T G.729.1 capabilities using ITU-T H.245. Both format and capability parameters are fully compatible with the corresponding ITU-T G.729.1 RTP definitions to allow seamless interoperability. Besides the new Annex, Amendment 1 to ITU-T G.729.1 incorporates changes needed to correct defects in ITU-T G.729.1 and provides new, more comprehensive test vectors.

Amendment 2 introduces the new Annex B, which defines an alternative implementation of the ITU-T G.729.1 algorithm in floating-point arithmetic, intended for DSP hardware optimized for floating-point operations. The accompanying floating-point C code is fully interoperable with the fixed-point C code.

Amendment 3 extends the low-delay functionality of the main body and Annex B to the first wideband bit rate (14 kbit/s). It also incorporates changes needed to correct defects in the text and C code of the ITU-T G.729.1 main body and Annex B.

Amendment 4 introduces a new Annex C specifying discontinuous transmission (DTX) and comfort noise generation (CNG) for ITU-T G.729.1. With this annex, the ITU-T G.729.1 encoder can generate a silence insertion description (SID) each time an update of the ambient background noise parameters is required to maintain the quality of the generated background noise. The SID information includes a core lower band layer, which can be decoded by the decoder of Annex B/G.729, an enhancement lower band layer and a higher band layer. The absence of transmission between SID updates and the small size of the SID significantly reduce the bandwidth used during inactive segments. Besides this new annex, Amendment 4 incorporates changes needed to correct defects identified in the ITU-T G.729.1 C source code (main body and Annex B), provides a revised set of test vectors, and updates the table of complexity figures in the ITU-T G.729.1 text.

Annex D, introduced by Amendment 5, provides an alternative floating-point implementation of the discontinuous transmission (DTX) and comfort noise generation (CNG) scheme of Annex C, which uses fixed-point arithmetic. Besides this new annex, Amendment 5 incorporates changes needed to correct defects identified in the ITU-T G.729.1 C source code for its main body and Annex B, and provides a revised set of test vectors.

Corrigendum 1 addressed problems discovered in the ANSI C code of the main body of ITU-T G.729.1 and of its Annexes B, C and D, in the so-called Release 1.5 of the code.

Amendment 6 introduces the new Annex E, which specifies a scalable superwideband (SWB, 50-14000 Hz) speech and audio coding algorithm operating from 36 to 64 kbit/s and interoperable with ITU-T G.729 and ITU-T G.729.1.

For consistency, the existing ANSI-C code for the whole of ITU-T G.729.1 is reissued as part of this publication and labelled as Release 1.6, without any additional change. Test vectors that complement this release are also available in the ITU-T test signal database at G.729.1.