ETRI-Knowledge Sharing Platform

Title
Speech Recognition for Task Domains with Sparse Matched Training Data
Cited 4 times in Scopus; downloaded 274 times
Authors
Byung Ok Kang, Hyeong Bae Jeon, Jeon Gue Park
Issue Date
2020-09
Citation
Applied Sciences, v.10, no.18, pp.1-15
ISSN
2076-3417
Publisher
MDPI
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.3390/app10186155
Abstract
We propose two approaches to handle speech recognition for task domains with sparse matched training data. One is an active learning method that selects training data for the target domain from another general domain that already has a significant amount of labeled speech data. This method uses attribute-disentangled latent variables. For the active learning process, we designed an integrated system consisting of a variational autoencoder with an encoder that infers latent variables with disentangled attributes from the input speech, and a classifier that selects training data with attributes matching the target domain. The other method combines data augmentation methods for generating matched target domain speech data and transfer learning methods based on teacher/student learning. To evaluate the proposed method, we experimented with various task domains with sparse matched training data. The experimental results show that the proposed method has qualitative characteristics suited to the desired purpose, outperforms random selection, and is comparable to using an equal amount of additional target domain data.
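The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the latent vectors have already been inferred by an encoder, and it swaps in a simple stand-in for the classifier (cosine similarity to the centroid of the target-domain latents); the abstract does not specify the classifier, so this is not the authors' method. The function names and toy data are hypothetical.

```python
import math

def centroid(vectors):
    """Mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def select_matching(general_latents, target_latents, k):
    """Return indices of the k general-domain items whose latent vectors
    are most similar to the target-domain centroid.

    Assumption: similarity to a centroid stands in for the paper's
    classifier, which the abstract does not describe in detail.
    """
    c = centroid(target_latents)
    ranked = sorted(
        range(len(general_latents)),
        key=lambda i: cosine(general_latents[i], c),
        reverse=True,
    )
    return ranked[:k]

# Toy latent vectors (hypothetical values, not from the paper).
target = [[1.0, 0.1], [0.9, 0.2]]
general = [[0.0, 1.0], [1.0, 0.0], [0.8, 0.3], [-1.0, 0.0]]
selected = select_matching(general, target, k=2)
print(selected)
```

A random-selection baseline, which the abstract reports as weaker, would instead draw `k` indices with `random.sample(range(len(general)), k)`.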
KSP Keywords
Active learning method, Data Augmentation, Latent variables, Learning Process, Qualitative characteristics, Random selection, Student Learning, Target domain, Transfer learning, active learning(AL), integrated system
This work is distributed under the terms of the Creative Commons License (CC BY).