ETRI-Knowledge Sharing Platform

Semi-supervised Training for Sequence-to-Sequence Speech Recognition Using Reinforcement Learning
Cited 10 times in Scopus
Authors
Hoon Chung, Hyeong-Bae Jeon, Jeon Gue Park
Issue Date
2020-07
Citation
International Joint Conference on Neural Networks (IJCNN) 2020, pp.1-6
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/IJCNN48605.2020.9207023
Abstract
This paper proposes a reinforcement learning based semi-supervised training approach for sequence-to-sequence automatic speech recognition (ASR) systems. Most recent semi-supervised training approaches are based on multi-loss functions such as cross-entropy loss for speech-to-text paired data and reconstruction loss for speech-text unpaired data. Although these approaches show promising results, some considerations still remain: (a) different loss functions are used for paired and unpaired data separately even though the purpose is classification accuracy improvement, and (b) several methods need auxiliary networks that increase the complexity of a semi-supervised training process. To address these issues, a reinforcement learning based approach is proposed. The proposed approach focuses on rewarding ASR to generate more correct sentences for both paired and unpaired speech data. The proposed approach is evaluated on the Wall Street Journal task domain. The experimental results show that the proposed method is effective by reducing the character error rate from 10.4% to 8.7%.
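The abstract reports its result as a character error rate (CER). As a minimal sketch, assuming CER is the Levenshtein (edit) distance between the reference and hypothesis character sequences divided by the reference length, the metric and the relative size of the reported reduction could be computed as below. This record does not describe the paper's exact scoring setup, and the function names `edit_distance`, `cer`, and `relative_reduction` are illustrative, not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # drop a reference character
            insertion = curr[j - 1] + 1  # extra hypothesis character
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (before - after) / before
```

Under these assumptions, the reported drop from 10.4% to 8.7% is 1.7 points absolute, or roughly 16% relative to the baseline (`relative_reduction(10.4, 8.7)`).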
KSP Keywords
Auxiliary networks, Cross entropy, Entropy loss, Paired data, Reinforcement learning (RL), Speech-To-Text (STT), Wall Street, accuracy improvement, automatic speech recognition (ASR), classification accuracy, error rate