ETRI-Knowledge Sharing Platform

Details

Journal article: Speech Recognition for Task Domains with Sparse Matched Training Data
Cited 3 times in Scopus; downloaded 198 times
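Authors
강병옥, 전형배, 박전규
Publication date
2020-09
Source
Applied Sciences, v.10 no.18, pp.1-15
ISSN
2076-3417
Publisher
MDPI
DOI
https://dx.doi.org/10.3390/app10186155
Funded project
20HS1700, Development of semi-supervised learning-based core language intelligence technology and a Korean tutoring service for foreigners built on it, 이윤근
Abstract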
We propose two approaches to speech recognition for task domains with sparse matched training data. The first is an active learning method that selects training data for the target domain from a general domain that already has a large amount of labeled speech data. It relies on attribute-disentangled latent variables: we designed an integrated system consisting of a variational autoencoder, whose encoder infers latent variables with disentangled attributes from the input speech, and a classifier that selects training data whose attributes match the target domain. The second approach combines data augmentation, which generates matched target-domain speech data, with transfer learning based on teacher/student learning. To evaluate the proposed method, we ran experiments on various task domains with sparse matched training data. The results show that the proposed method has qualitative characteristics suited to its purpose, outperforms random selection, and is comparable to using an equal amount of additional target-domain data.
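The selection step of the active learning approach can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the latent vectors have already been inferred by some encoder, and it stands in a simple logistic scorer for the classifier. The function names, the toy latent vectors, and the weights are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def domain_score(latent: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score in (0, 1): how well a latent vector matches the target domain."""
    z = sum(w * x for w, x in zip(weights, latent)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def select_matched(
    pool: List[Tuple[str, Sequence[float]]],
    weights: Sequence[float],
    bias: float,
    budget: int,
) -> List[str]:
    """Return the ids of the `budget` pool utterances with the highest score."""
    ranked = sorted(
        pool,
        key=lambda item: domain_score(item[1], weights, bias),
        reverse=True,
    )
    return [utt_id for utt_id, _ in ranked[:budget]]


# Toy general-domain pool: (utterance id, latent vector).
# The latent vectors are made up; a real system would take them from the encoder.
pool = [
    ("utt_a", [0.9, 0.1]),
    ("utt_b", [-0.5, 0.3]),
    ("utt_c", [0.4, 0.8]),
    ("utt_d", [-1.2, -0.7]),
]

# Hypothetical classifier parameters; in practice they would be trained
# on the small amount of available target-domain data.
weights, bias = [2.0, 1.0], 0.0

selected = select_matched(pool, weights, bias, budget=2)
print(selected)
```

Ranking by classifier score rather than sampling uniformly is what distinguishes this kind of selection from the random-selection baseline mentioned in the abstract.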
KSP suggested keywords
Active learning method, Data Augmentation, Integrated system, Latent Variable, Qualitative characteristics, Student Learning, Target domain, Transfer learning, active learning(AL), learning process, random selection
This work may be used under the terms of the Creative Commons Attribution (CC BY) license.