ETRI-Knowledge Sharing Platform


Rank-weighted reconstruction feature for a robust deep neural network-based acoustic model
Cited 3 times in Scopus. Downloaded 19 times.
Authors
Hoon Chung, Jeon Gue Park, Ho-Young Jung
Issue Date
2019-04
Citation
ETRI Journal, v.41, no.2, pp.235-241
ISSN
1225-6463
Publisher
Electronics and Telecommunications Research Institute (ETRI)
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.4218/etrij.2018-0189
Abstract
In this paper, we propose a rank-weighted reconstruction feature to improve the robustness of a feed-forward deep neural network (FFDNN)-based acoustic model. In an FFDNN-based acoustic model, an input feature is constructed by vectorizing a submatrix that is created by slicing the feature vectors of the frames within a context window. In this type of feature construction, the context window size is important because it determines how much trivial information (such as redundancy) and discriminative information (such as temporal context) the input feature contains. However, we questioned whether a single parameter can sufficiently control the quantity of information. Therefore, we investigated input feature construction from the perspectives of rank and nullity, and propose a rank-weighted reconstruction feature that retains speech information components while reducing trivial components. The proposed method was evaluated on the TIMIT phone recognition and Wall Street Journal (WSJ) domains. It reduced the phone error rate on TIMIT from 18.4% to 18.0% and the word error rate on WSJ from 4.70% to 4.43%.
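The context-window feature construction described in the abstract, and the rank and nullity of the sliced submatrix, can be illustrated with a minimal NumPy sketch. This is not the authors' rank-weighted reconstruction method, which the abstract does not specify. The function names `splice_context` and `rank_and_nullity`, the edge-padding choice, and the tolerance value are all assumptions made for illustration.

```python
import numpy as np


def splice_context(features: np.ndarray, left: int, right: int) -> np.ndarray:
    """Build context-window input features (hypothetical helper).

    features: array of shape (num_frames, feat_dim), one feature vector per frame.
    Returns an array of shape (num_frames, (left + right + 1) * feat_dim), where
    row t is the vectorized submatrix of frames t-left .. t+right.
    """
    num_frames, _ = features.shape
    window = left + right + 1
    # Assumption: repeat the first/last frame so boundary frames get a full window.
    padded = np.pad(features, ((left, right), (0, 0)), mode="edge")
    # padded[t : t + window] is the submatrix centered on original frame t.
    return np.stack([padded[t:t + window].reshape(-1) for t in range(num_frames)])


def rank_and_nullity(submatrix: np.ndarray, tol: float = 1e-6) -> tuple[int, int]:
    """Numerical rank and nullity of a (window, feat_dim) submatrix.

    Nullity follows from the rank-nullity theorem: columns minus rank.
    The tolerance is an illustrative assumption, not a value from the paper.
    """
    rank = int(np.linalg.matrix_rank(submatrix, tol=tol))
    return rank, submatrix.shape[1] - rank


# Toy example: 4 frames of 3-dimensional features, one frame of context per side.
toy = np.arange(12, dtype=float).reshape(4, 3)
spliced = splice_context(toy, left=1, right=1)
print(spliced.shape)  # (4, 9)
print(rank_and_nullity(toy[0:3]))
```

In this sketch the window size is the single parameter that fixes how many frames enter each input vector. The rank of the sliced submatrix gives one rough indication of how much of that content is linearly independent rather than redundant.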
KSP Keywords
Deep neural network (DNN), Discriminative information, Feature Vector, Feed-forward Deep Neural Network, Input features, Phone Error Rate, Single parameter, Speech information, Wall Street, Window Size, Acoustic model