ETRI Knowledge Sharing Platform

Detailed Information

Conference Paper: Semi-supervised Training for Sequence-to-Sequence Speech Recognition Using Reinforcement Learning
Cited 5 times in Scopus
Authors
정훈, 전형배, 박전규
Publication Date
July 2020
Source
International Joint Conference on Neural Networks (IJCNN) 2020, pp.1-6
DOI
https://dx.doi.org/10.1109/IJCNN48605.2020.9207023
Research Project
20HS1700, Development of Core Technology for Semi-Supervised Language Intelligence and a Korean Tutoring Service for Foreigners Based on It, 이윤근
Abstract
This paper proposes a reinforcement learning based semi-supervised training approach for sequence-to-sequence automatic speech recognition (ASR) systems. Most recent semi-supervised training approaches rely on multiple loss functions, such as a cross-entropy loss for speech-to-text paired data and a reconstruction loss for speech-text unpaired data. Although these approaches show promising results, two concerns remain: (a) different loss functions are used for paired and unpaired data even though the goal in both cases is to improve classification accuracy, and (b) several methods require auxiliary networks that increase the complexity of the semi-supervised training process. To address these issues, a reinforcement learning based approach is proposed that rewards the ASR model for generating more correct sentences on both paired and unpaired speech data. The proposed approach is evaluated on the Wall Street Journal task. The experimental results show that the proposed method is effective, reducing the character error rate from 10.4% to 8.7%.
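
The snippet below is a minimal sketch of the kind of REINFORCE-style objective the abstract describes: the model is rewarded for producing more correct sentences, with a character-error-rate reward when a reference transcript exists. The helper names, the baseline value, and the idea of substituting a language-model score as the reward for unpaired speech are assumptions made here for illustration, not details taken from the paper.

# Minimal sketch (PyTorch): REINFORCE-style loss with a CER-based reward.
# Assumptions: `cer` and `reinforce_loss` are illustrative helpers, not the
# paper's code; the unpaired-data reward source is left as a placeholder.
import torch


def cer(hyp: str, ref: str) -> float:
    """Character error rate: Levenshtein distance normalized by reference length."""
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(hyp)][len(ref)] / max(len(ref), 1)


def reinforce_loss(token_log_probs: torch.Tensor, reward: float, baseline: float) -> torch.Tensor:
    """REINFORCE: minimize -(R - b) * sum of log-probabilities of the sampled tokens."""
    return -(reward - baseline) * token_log_probs.sum()


# Paired data: the reward comes from the reference transcript (1 - CER).
probs = torch.tensor([0.9, 0.8, 0.7], requires_grad=True)  # toy sampled-token probabilities
reward = 1.0 - cer("helo wrld", "hello world")
loss = reinforce_loss(torch.log(probs), reward, baseline=0.5)
loss.backward()  # gradient pushes up the probability of higher-reward hypotheses

# Unpaired data: no transcript is available, so the reward would have to come
# from another signal (e.g., a language-model score); that choice is an
# assumption here, not a detail stated in the abstract.

In practice, a single reward-weighted objective of this form can be applied uniformly to paired and unpaired batches, which is the property the abstract highlights as an advantage over separate cross-entropy and reconstruction losses.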
KSP Suggested Keywords
Auxiliary networks, Cross-Entropy, Entropy loss, Paired data, Reinforcement Learning(RL), Speech-To-Text(STT), Wall Street, accuracy improvement, automatic speech recognition(ASR), classification accuracy, error rate