ETRI Knowledge Sharing Platform : A Useful Feature-Engineering Approach for a LVCSR System Based on CD-DNN-HMM Algorithm


Conference Paper: A Useful Feature-Engineering Approach for a LVCSR System Based on CD-DNN-HMM Algorithm


Citation: European Signal Processing Conference (EUSIPCO) 2015, pp.1436-1440

Abstract: In this paper, we propose a useful feature-engineering approach for Context-Dependent Deep-Neural-Network Hidden-Markov-Model (CD-DNN-HMM) based Large-Vocabulary Continuous-Speech-Recognition (LVCSR) systems. The speech recognition performance of an LVCSR system is improved from two feature-engineering perspectives. The first improvement is achieved by adopting intra/inter-frame feature subsets when building the Gaussian-Mixture-Model (GMM) HMMs used for HMM state-level alignment. The second gain then comes from augmenting the front-end of the DNN with additional features. We evaluate the effectiveness of our feature-engineering approach on a series of Korean speech recognition tasks (isolated single-syllable recognition with a medium-sized speech corpus and conversational speech recognition with a large-sized database) using the Kaldi speech recognition toolkit. The results show that the proposed feature-engineering approach outperforms the traditional method that pairs Mel-Frequency Cepstral Coefficient (MFCC) features for the GMM with Mel-frequency filter-bank outputs for the DNN.
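The second step in the abstract, augmenting the DNN front-end with additional features, can be pictured with a minimal sketch. The abstract does not say which additional features are used or how they are combined, so everything here is an assumption for illustration: the function names `augment` and `splice`, the per-frame concatenation, and the context-window stacking with repeated edge frames are hypothetical and are not taken from the paper or from Kaldi.

```python
from typing import List

Frame = List[float]  # one feature vector per speech frame


def augment(base: List[Frame], extra: List[Frame]) -> List[Frame]:
    """Concatenate hypothetical extra features onto each base frame.

    `base` stands in for filter-bank outputs and `extra` for whatever
    additional features are chosen; both are assumptions of this sketch.
    """
    if len(base) != len(extra):
        raise ValueError("feature streams must have the same number of frames")
    return [b + e for b, e in zip(base, extra)]


def splice(frames: List[Frame], context: int) -> List[Frame]:
    """Stack each frame with its +/- `context` neighbours.

    Edge frames are repeated so every output vector has the same length.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        stacked: Frame = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to valid frame range
            stacked.extend(frames[idx])
        spliced.append(stacked)
    return spliced


# Toy input: three frames of 2-dimensional base features plus 1 extra value each.
base_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
extra_feats = [[0.1], [0.2], [0.3]]
dnn_input = splice(augment(base_feats, extra_feats), context=1)
```

With these toy values, each frame grows from 2 to 3 dimensions after augmentation and to 9 dimensions after splicing with one frame of context on either side. A real front-end would use far larger feature and context dimensions.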

KSP Keywords: CD-DNN-HMM, Conversational speech recognition, Engineering approach, Filter bank, Front-End, Inter-frame, Korean speech, Level alignment, Markov model, Mel-frequency Cepstral Coefficient (MFCC), Performance gain
