ETRI Knowledge Sharing Platform

Details

Journal article: Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition
Authors
이담허, 김동현, 윤승, 김상훈
Publication date
2021-03
Source
Applied Sciences, v.11 no.6, pp.1-14
ISSN
2076-3417
Publisher
MDPI
DOI
https://dx.doi.org/10.3390/app11062866
Funding project
21ZS1100, Core Technology Research for Self-Improving Integrated Artificial Intelligence System, 송화전
Abstract
In this paper, we propose a new method for code-switching (CS) automatic speech recognition (ASR) in Korean. First, the phonetic variations in English pronunciation spoken by Korean speakers should be considered. Thus, we tried to find a unified pronunciation model based on phonetic knowledge and deep learning. Second, we extracted the CS sentences semantically similar to the target domain and then applied the language model (LM) adaptation to solve the biased modeling toward Korean due to the imbalanced training data. In this experiment, training data were AI Hub (1033 h) in Korean and Librispeech (960 h) in English. As a result, when compared to the baseline, the proposed method improved the error reduction rate (ERR) by up to 11.6% with phonetic variant modeling and by 17.3% when semantically similar sentences were applied to the LM adaptation. If we considered only English words, the word correction rate improved up to 24.2% compared to that of the baseline. The proposed method seems to be very effective in CS speech recognition.
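The abstract reports its gains as an error reduction rate (ERR) but does not define the term. A minimal sketch, assuming ERR means the relative reduction in an error rate against the baseline, expressed in percent; the function name and the input values are hypothetical and are not taken from the paper:

```python
def error_reduction_rate(baseline_error: float, proposed_error: float) -> float:
    """Relative error reduction in percent (assumed definition of ERR).

    Both arguments are error rates on the same scale, for example
    word error rates in percent.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical inputs for illustration only, not results from the paper.
example_err = error_reduction_rate(baseline_error=20.0, proposed_error=17.0)
print(f"ERR = {example_err:.1f}%")
```

Under this reading, a figure such as 11.6% would be a reduction relative to the baseline error rate, not an absolute drop in percentage points.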
Keywords
Acoustic model, Code-switching, Domain adaptation, Language model, Shallow fusion, Speech recognition
KSP suggested keywords
Code-switching, Correction rate, Error reduction, LM adaptation, Reduction rate, Target domain, Acoustic model, Automatic speech recognition (ASR), Deep learning (DL), Domain adaptation, Language model adaptation
This work may be used under the terms of the Creative Commons Attribution (CC BY) license.