ETRI-Knowledge Sharing Platform

Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition
Cited 5 times in Scopus. Downloaded 347 times.
Authors
Damheo Lee, Donghyun Kim, Seung Yun, Sanghun Kim
Issue Date
2021-03
Citation
Applied Sciences, v.11, no.6, pp.1-14
ISSN
2076-3417
Publisher
MDPI
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.3390/app11062866
Abstract
In this paper, we propose a new method for code-switching (CS) automatic speech recognition (ASR) in Korean. First, because the English pronunciation of Korean speakers exhibits phonetic variations, we sought a unified pronunciation model based on phonetic knowledge and deep learning. Second, we extracted CS sentences semantically similar to the target domain and applied language model (LM) adaptation to counter the modeling bias toward Korean caused by the imbalanced training data. In our experiments, the training data were AI Hub (1033 h) in Korean and LibriSpeech (960 h) in English. Compared to the baseline, the proposed method improved the error reduction rate (ERR) by up to 11.6% with phonetic variant modeling and by 17.3% when semantically similar sentences were applied to the LM adaptation. When only English words were considered, the word correction rate improved by up to 24.2% over the baseline. These results suggest that the proposed method is effective for CS speech recognition.
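The abstract reports its gains as an error reduction rate (ERR) but does not define the metric. The sketch below assumes the common reading of ERR as the relative reduction in an error measure with respect to the baseline; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def error_reduction_rate(baseline_error: float, proposed_error: float) -> float:
    """Relative error reduction over a baseline, in percent.

    Assumption: ERR = (baseline - proposed) / baseline * 100.
    The paper's exact definition may differ.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical error rates in percent, for illustration only.
example = error_reduction_rate(baseline_error=20.0, proposed_error=17.0)
print(f"ERR: {example:.1f}%")
```

Under this reading, an ERR of 11.6% would mean the proposed system removes 11.6% of the baseline's errors, not that the absolute error rate drops by 11.6 percentage points.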
KSP Keywords
Code-switching, Correction rate, Error reduction, LM adaptation, Target domain, automatic speech recognition (ASR), deep learning (DL), language model adaptation, model-based, new method, reduction rate
This work is distributed under the terms of the Creative Commons License (CC BY).