ETRI Knowledge Sharing Platform : Rank‐weighted reconstruction feature for a robust deep neural network‐based acoustic model

Journal Article Rank‐weighted reconstruction feature for a robust deep neural network‐based acoustic model

Cited 3 times in Scopus

Downloaded 72 times

Abstract: In this paper, we propose a rank-weighted reconstruction feature to improve the robustness of a feed-forward deep neural network (FFDNN)-based acoustic model. In an FFDNN-based acoustic model, an input feature is constructed by vectorizing a submatrix that is created by slicing the feature vectors of the frames within a context window. In this type of feature construction, choosing an appropriate context window size is important because it determines how much trivial information (such as redundancy) and discriminative information (such as temporal context) the input features carry. However, we examined whether a single parameter is sufficient to control the quantity of information. We therefore investigated input feature construction from the perspectives of rank and nullity, and propose a rank-weighted reconstruction feature that retains speech information components while reducing trivial components. The proposed method was evaluated on the TIMIT phone recognition task and the Wall Street Journal (WSJ) task. It reduced the phone error rate on TIMIT from 18.4% to 18.0%, and the word error rate on WSJ from 4.70% to 4.43%.
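The feature construction described in the abstract (slicing the frames inside a context window into a submatrix, then vectorizing it) can be sketched as follows, together with a rank and nullity check on each submatrix. This is only an illustrative sketch, not the authors' implementation: the function names, the edge-padding choice, the tolerance, and in particular the `weights` argument of `reweighted_reconstruction` are assumptions, since the abstract does not specify how the rank weighting is computed.

```python
import numpy as np


def splice_context(features, left, right):
    """Build one context-window submatrix per frame.

    features: array of shape (num_frames, feat_dim).
    Edges are padded by repeating the first/last frame (an assumption).
    Returns an array of shape (num_frames, left + 1 + right, feat_dim).
    """
    num_frames = features.shape[0]
    padded = np.concatenate(
        [
            np.repeat(features[:1], left, axis=0),
            features,
            np.repeat(features[-1:], right, axis=0),
        ],
        axis=0,
    )
    width = left + 1 + right
    return np.stack([padded[t:t + width] for t in range(num_frames)])


def rank_and_nullity(window, tol=1e-8):
    """Numerical rank and nullity of one submatrix (rank + nullity = columns)."""
    s = np.linalg.svd(window, compute_uv=False)
    rank = int(np.sum(s > tol * s.max())) if s.max() > 0 else 0
    return rank, window.shape[1] - rank


def reweighted_reconstruction(window, weights):
    """Rebuild a submatrix from its SVD with per-component weights.

    `weights` is a hypothetical placeholder: one scale factor per
    singular value. All-ones weights reproduce the original submatrix.
    """
    u, s, vt = np.linalg.svd(window, full_matrices=False)
    return (u * (s * weights)) @ vt


# Toy input: 4 frames of 3-dimensional features (not real speech data).
features = np.arange(12, dtype=float).reshape(4, 3)
windows = splice_context(features, left=1, right=1)

# Vectorize each submatrix to obtain the per-frame network input.
inputs = windows.reshape(windows.shape[0], -1)

rank, nullity = rank_and_nullity(windows[0])
rebuilt = reweighted_reconstruction(windows[0], np.ones(3))
```

In this toy case the first window repeats frame 0 because of the edge padding, so the submatrix is rank-deficient; that is the kind of redundancy that the context window size alone cannot separate from useful temporal context.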

KSP Keywords: Deep neural network (DNN), Discriminative information, Feature Vector, Feed-forward Deep Neural Network, Input features, Phone Error Rate, Single parameter, Speech information, Wall Street, Window Size, Word Error Rate

218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, KOREA, Contact: sh.kim@etri.re.kr
