ETRI Knowledge Sharing Platform

Details

Journal Article: Fast Speaker Adaptation using Extended Diagonal Linear Transformation for Deep Neural Networks
Cited 2 times in Scopus · Downloaded 107 times
Authors
김동현, 김상훈
Publication Date
February 2019
Source
ETRI Journal, v.41 no.1, pp.109-116
ISSN
1225-6463
Publisher
Electronics and Telecommunications Research Institute (ETRI)
DOI
https://dx.doi.org/10.4218/etrij.2017-0087
Funded Project
17HS1700, Development of Core Technology for Knowledge-Augmented Real-Time Simultaneous Interpretation, 김영길
Abstract
This paper explores new techniques based on hidden-layer linear transformations for fast speaker adaptation in deep neural networks (DNNs). Conventional methods that use full affine transformations are impractical because they require a relatively large number of adaptation parameters. Methods that employ singular-value decomposition (SVD) are effective at reducing the number of adaptation parameters, but matrix decomposition is computationally expensive for online services. We propose an extended diagonal linear transformation method that minimizes the number of adaptation parameters without SVD and increases performance on tasks with only small amounts of adaptation data. In Korean large-vocabulary continuous speech recognition (LVCSR) tasks, the proposed method shows significant improvements, with error-reduction rates of 8.4% and 17.1% when adapting on five and 50 conversational sentences, respectively. Compared with SVD-based adaptation methods, it achieves higher recognition performance with fewer parameters.
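The parameter-count argument behind the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact formulation: it contrasts adapting one hidden layer with a full affine transformation against an elementwise (diagonal) scale-and-shift, assuming a hypothetical hidden-layer width of 512. All names here are illustrative.

```python
import numpy as np

def affine_adapt(h, W, b):
    # Full affine adaptation of a hidden layer:
    # d*d + d speaker-specific parameters.
    return W @ h + b

def diagonal_adapt(h, scale, bias):
    # Diagonal linear adaptation: elementwise scale and shift,
    # only 2*d speaker-specific parameters.
    return scale * h + bias

d = 512                      # hypothetical hidden-layer width
h = np.random.randn(d)       # hidden-layer activations

# Initializing at identity (scale = 1, bias = 0) leaves the
# network's output unchanged before adaptation begins.
assert np.allclose(diagonal_adapt(h, np.ones(d), np.zeros(d)), h)

affine_params = d * d + d    # parameters per adapted layer (affine)
diag_params = 2 * d          # parameters per adapted layer (diagonal)
print(f"affine: {affine_params}, diagonal: {diag_params}")
```

With d = 512 the affine variant needs 262,656 parameters per adapted layer versus 1,024 for the diagonal variant, which is why a diagonal transformation can be estimated reliably from only a handful of adaptation sentences.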
KSP Suggested Keywords
Adaptive parameter, Affine Transformation, Conventional methods, Deep neural network(DNN), Linear transformation, Matrix decomposition, Online services, Performance levels, Reduction rate, Transformation method, computationally expensive
This work may be used under the Korea Open Government License (KOGL) Type 4: attribution, non-commercial use only, no derivative works.