ETRI-Knowledge Sharing Platform

Zero-Crossing-Based Channel Attentive Weighting of Cepstral Features for Robust Speech Recognition: The ETRI 2011 CHiME Challenge System
Authors
Young-Ik Kim, Hoon-Young Cho, Sang-Hun Kim
Issue Date
2011-08
Citation
International Speech Communication Association (INTERSPEECH) 2011, pp.1649-1652
Publisher
ISCA
Language
English
Type
Conference Paper
Abstract
We present a practical, noise-robust speech recognition system that estimates a target-to-interferers power ratio using a zero-crossing-based binaural model and applies that ratio to a channel attentive missing feature decoder in the cepstral domain. In a natural multisource environment, our binaural model extracts spatial cues at each zero-crossing of a filterbank output signal to localize multiple sound sources, and reliably estimates a ratio mask that segregates target speech from interfering noise. Our system uses gammatone filterbank cepstral coefficients (GFCCs) for recognition, and the channel attentive decoder uses the ratio mask to weight the cepstral features when calculating the output probability in Viterbi decoding. In experiments on the CHiME final test set, our channel attentive GFCC system improves on the baseline recognition result by 12.2% on average, and under the noisy training condition the average improvement amounts to 18.8%.
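The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' decoder: it assumes a diagonal-covariance Gaussian output distribution and assumes the ratio mask has already been converted into one reliability weight in [0, 1] per cepstral dimension (the abstract does not say how that conversion is done). The function name `weighted_log_likelihood` and all of its parameters are hypothetical.

```python
import math


def weighted_log_likelihood(features, means, variances, weights):
    """Weighted diagonal-Gaussian log-likelihood for one frame.

    Assumption (not from the paper): each feature dimension contributes
    its own Gaussian log-density, scaled by a reliability weight in [0, 1].
    A weight of 1 keeps the dimension's full contribution; a weight of 0
    removes that dimension from the score.
    """
    if not (len(features) == len(means) == len(variances) == len(weights)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for x, mu, var, w in zip(features, means, variances, weights):
        # Log-density of a univariate Gaussian N(x; mu, var).
        log_density = -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        # Down-weight dimensions judged to be dominated by interference.
        total += w * log_density
    return total
```

With all weights set to 1 this reduces to the ordinary diagonal-Gaussian log-likelihood, so a score of this kind could in principle stand in for the per-state output probability in a standard Viterbi search.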
KSP Keywords
CHiME challenge, Gammatone filterbank, Noise robust speech recognition, Spatial cue, Speech recognition system, Viterbi decoding, Zero-crossing, binaural model, cepstral coefficients, cepstral features, multiple sound sources