ETRI Knowledge Sharing Platform : Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition

Journal Article: Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition

Cited 6 times in Scopus

Downloaded 1194 times

Abstract: In this paper, we propose a new method for Korean-English code-switching (CS) automatic speech recognition (ASR). First, because the phonetic variations in English as pronounced by Korean speakers must be taken into account, we sought a unified pronunciation model based on phonetic knowledge and deep learning. Second, to counter the modeling bias toward Korean caused by imbalanced training data, we extracted CS sentences semantically similar to the target domain and used them for language model (LM) adaptation. The training data were AI Hub (1033 h) for Korean and LibriSpeech (960 h) for English. Compared to the baseline, the proposed method achieved an error reduction rate (ERR) of up to 11.6% with phonetic variant modeling and of 17.3% when semantically similar sentences were applied to LM adaptation. When only English words were considered, the word correction rate improved by up to 24.2% over the baseline. These results suggest that the proposed method is effective for CS speech recognition.
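The abstract reports its gains as an error reduction rate (ERR) but does not spell out the formula. The sketch below assumes the common definition, the relative decrease in error rate with respect to the baseline, expressed in percent. The function name and the input numbers are hypothetical, chosen for illustration only; they are not values from the paper.

```python
def error_reduction_rate(baseline_error: float, proposed_error: float) -> float:
    """Relative error reduction in percent.

    Assumed definition (not confirmed by the abstract):
        ERR = 100 * (baseline - proposed) / baseline
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical error rates for illustration only; not results from the paper.
baseline_wer = 20.0  # baseline word error rate, in percent
proposed_wer = 17.0  # word error rate of an improved system, in percent

err = error_reduction_rate(baseline_wer, proposed_wer)
print(f"ERR = {err:.1f}%")
```

Under this assumed definition, an ERR of 11.6% would mean that the proposed system makes 11.6% fewer errors than the baseline, not that the absolute error rate dropped by 11.6 percentage points.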

KSP Keywords: Code-switching, Correction rate, Error reduction, LM adaptation, Language model adaptation, Target domain, automatic speech recognition(ASR), deep learning(DL), model-based, new method, reduction rate

This work is distributed under the terms of the Creative Commons License (CCL) (CC BY).
