ETRI-Knowledge Sharing Platform





Conference Paper: Phonetic State Relation Graph Regularized Deep Neural Network for Robust Acoustic Model
정훈, 오유리, 이성주, 박전규
International Joint Conference on Neural Networks (IJCNN) 2017, pp.3081-3085
17HS5700, Development of Core Technology for Free-Speech Spoken Dialogue Processing for Language Learning, 이윤근
In this paper, we propose a phonetic state relation graph regularized Deep Neural Network (DNN) for a robust acoustic model. A DNN-based acoustic model is trained by minimizing a cost function that is usually penalized by regularization terms. Regularization generally reflects prior knowledge that constrains the model parameter space. For DNN-based acoustic models, various regularizations have been proposed to improve robustness. However, most approaches do not use knowledge of the speech generation process, even though this process is the most fundamental prior. For example, l1- and l2-norm regularizations are equivalent to placing a Laplacian prior and a Gaussian prior on the model parameters, respectively. This means that no speech-signal-specific knowledge is used for regularization. Manifold-based regularization exploits the local linear structure of the observed acoustic features, which are simply realizations of the speech generation process. Therefore, to incorporate prior knowledge of speech generation into regularization, we propose a phonetic state relation graph based approach. The method was evaluated on the TIMIT phone recognition task. The results showed that it reduced the phone error rate from 20.8% to 20.3% (a relative reduction of about 2.4%) under the same conditions.
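The abstract does not give the exact form of the regularizer, so the following is only a minimal sketch of a generic graph-based penalty, not the authors' formulation. It assumes (hypothetically) that each phonetic state has a weight vector, that related states are linked in a symmetric adjacency matrix, and that the penalty pulls the weight vectors of linked states toward each other. The names `graph_penalty`, `regularized_cost`, and `lam` are illustrative.

```python
def graph_penalty(weights, adjacency):
    """Generic graph smoothness penalty (an assumption, not the paper's formula).

    Computes 0.5 * sum_ij A[i][j] * ||w_i - w_j||^2, which for a symmetric
    adjacency matrix A equals tr(W^T L W) with graph Laplacian L = D - A.
    `weights` is a list of per-state weight vectors (hypothetical layout).
    """
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if adjacency[i][j]:
                sq_dist = sum((a - b) ** 2 for a, b in zip(weights[i], weights[j]))
                total += adjacency[i][j] * sq_dist
    return 0.5 * total


def regularized_cost(data_cost, weights, adjacency, lam=1e-3):
    """Add the graph penalty, scaled by a hypothetical strength `lam`, to a data cost."""
    return data_cost + lam * graph_penalty(weights, adjacency)


# Toy example: three states, where only states 0 and 1 are marked as related.
A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
W = [[1.0, 0.0],
     [0.0, 1.0],
     [5.0, 5.0]]

penalty = graph_penalty(W, A)          # only the 0-1 pair contributes
cost = regularized_cost(1.0, W, A, lam=0.5)
```

In this toy setup the unrelated state 2 can sit far from the others at no cost, whereas an l2-norm penalty would shrink all three vectors toward zero regardless of any phonetic relation between them.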
KSP Suggested Keywords
Cost Function, DNN-based acoustic model, Deep neural network (DNN), Graph regularized, Graph-based Approach, L1 and L2-norm, Linear structure, Local linear, Model parameter, Phone Error Rate, Speech Signal