ETRI Knowledge Sharing Platform

Building Robust Korean Speech Recognition Model by Fine-tuning Large Pretrained Model
Authors
Changhan Oh, Cheongbin Kim, Kiyoung Park
Issue Date
2023-09
Citation
말소리와 음성과학 (Phonetics and Speech Sciences), v.15, no.3, pp.75-82
ISSN
2005-8063
Publisher
한국음성학회 (The Korean Society of Speech Sciences)
Language
Korean
Type
Journal Article
DOI
https://dx.doi.org/10.13064/KSSS.2023.15.3.075
Abstract
Automatic speech recognition (ASR) has been revolutionized by deep learning-based approaches, among which self-supervised learning methods have proven particularly effective. In this study, we aim to enhance the performance of OpenAI's Whisper model, a multilingual ASR system, on the Korean language. Whisper was pretrained on a large corpus (around 680,000 hours) of web speech data and has demonstrated strong recognition performance for major languages. However, it faces challenges with languages such as Korean, which was not a major language in its training data. We address this issue by fine-tuning the Whisper model on an additional dataset comprising about 1,000 hours of Korean speech. We also compare its performance against a Transformer model trained from scratch on the same dataset. Our results indicate that fine-tuning significantly improved the Whisper model's Korean speech recognition in terms of character error rate (CER), and that performance improved with increasing model size. However, the Whisper model's English performance deteriorated after fine-tuning, underscoring the need for further research toward robust multilingual models. Our study demonstrates the potential of a fine-tuned Whisper model for Korean ASR applications. Future work will focus on multilingual recognition and optimization for real-time inference.
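The abstract's evaluation metric is character error rate (CER), the standard metric for Korean ASR since Korean text does not segment cleanly into words. As a minimal sketch (not the authors' evaluation code), CER is the character-level Levenshtein edit distance between a hypothesis and a reference transcript, normalized by the reference length:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level Levenshtein distance
    (substitutions + insertions + deletions), divided by the number
    of characters in the reference."""
    ref, hyp = list(reference), list(hypothesis)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + sub)  # substitution
    return dist[-1][-1] / len(ref)

# One substituted syllable in a 5-character reference -> CER 0.2
print(cer("안녕하세요", "안녕하세오"))  # 0.2
```

In practice a library such as jiwer is typically used for this computation; the sketch above shows only the underlying definition.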
KSP Keywords
Fine-tuning, Korean language, Korean speech, Learning methods, Learning-based, Performance improved, Real-time inference, Recognition model, Recognition performance, Web Speech, automatic speech recognition(ASR)
This work is distributed under the terms of the Creative Commons License (CCL) CC BY-NC.