ETRI Knowledge Sharing Platform : Phonetic State Relation Graph Regularized Deep Neural Network for Robust Acoustic Model

Conference Paper Phonetic State Relation Graph Regularized Deep Neural Network for Robust Acoustic Model

Cited 0 times in Scopus

Citation: International Joint Conference on Neural Networks (IJCNN) 2017, pp.3081-3085

Abstract: In this paper, we propose a phonetic state relation graph regularized Deep Neural Network (DNN) for a robust acoustic model. A DNN-based acoustic model is trained by minimizing a cost function that is usually penalized by regularization terms. Regularization generally reflects prior knowledge and serves to constrain the model parameter space. For DNN-based acoustic models, various regularizations have been proposed to improve robustness. However, most approaches do not use knowledge of the speech generation process, even though this process is the most fundamental prior. For example, l1- and l2-norm regularizations are equivalent to placing Laplacian and Gaussian priors on the model parameters, respectively, which means that no speech-signal-specific knowledge is used for regularization. Manifold-based regularization exploits the local linear structure of observed acoustic features, which are merely realizations of the speech generation process. Therefore, to incorporate prior knowledge of speech generation into regularization, we propose a phonetic state relation graph based approach. The method was evaluated on the TIMIT phone recognition task. The results showed that it reduced the phone error rate from 20.8% to 20.3% under the same conditions.
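
The abstract describes a training cost made up of a data term plus regularization penalties. The following is a minimal sketch of that general idea only, not the paper's formulation, which the abstract does not give. It assumes a generic graph penalty that pulls together the parameter vectors of states connected by an edge. All names (state_vectors, edges, lam_l2, lam_graph) and the toy values are hypothetical.

```python
import math

def cross_entropy(posteriors, target):
    """Data term: negative log-probability assigned to the target state."""
    return -math.log(posteriors[target])

def l1_penalty(weights):
    """Sum of absolute parameter values."""
    return sum(abs(w) for w in weights)

def l2_penalty(weights):
    """Sum of squared parameter values."""
    return sum(w * w for w in weights)

def graph_penalty(state_vectors, edges):
    """Assumed generic graph penalty: sum over edges (i, j, w_ij) of
    w_ij * ||v_i - v_j||^2, so related states get similar vectors."""
    total = 0.0
    for i, j, weight in edges:
        sq_dist = sum((a - b) ** 2
                      for a, b in zip(state_vectors[i], state_vectors[j]))
        total += weight * sq_dist
    return total

def regularized_cost(posteriors, target, state_vectors, edges,
                     lam_l2=1e-4, lam_graph=1e-3):
    """Data term plus weighted l2 and graph penalties (hypothetical weights)."""
    flat = [w for vec in state_vectors for w in vec]
    return (cross_entropy(posteriors, target)
            + lam_l2 * l2_penalty(flat)
            + lam_graph * graph_penalty(state_vectors, edges))

# Toy example: three states with 2-dimensional parameter vectors.
state_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
# Hypothetical relation graph: (state i, state j, edge weight).
edges = [(0, 1, 0.5), (0, 2, 1.0)]
posteriors = [0.5, 0.25, 0.25]

cost = regularized_cost(posteriors, 0, state_vectors, edges)
print(cost)
```

In this sketch the l1 and l2 penalties depend only on parameter magnitudes, while the graph penalty depends on which states are declared related. That is the kind of structural prior the abstract argues for, although the actual graph construction in the paper may differ.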

KSP Keywords: Cost Function, DNN-based acoustic model, Deep neural network(DNN), Graph regularized, L1 and L2-norm, Local linear, Model parameter, Phone Error Rate, Speech Signals, Speech generation, acoustic features
