ETRI Knowledge Sharing Platform

A Preliminary Study on Topical Model for Multi-domain Speech Recognition via Word Embedding Vector
Cited 2 times in Scopus
Authors
Jihye Moon, Seung Yun, Damheo Lee, Sanghun Kim
Issue Date
2019-06
Citation
International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC) 2019, pp.1-4
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ITC-CSCC.2019.8793299
Abstract
In this paper, we suggest a basic topical model (TM) framework to adapt a speech recognition system to multiple domains and prevent topical errors. The framework employs the cosine similarities between target and context words in a spoken utterance as the topical model parameters. The TM is applied to frames that have a large number of candidate words in the lattice network, and it adjusts the ranking of those candidates by adding the TM score to the total cost estimated from the acoustic model (AM) and the language model (LM). To cover multiple domains, the word embedding was trained on a multi-domain text corpus of 5.5 billion words. A DNN-HMM acoustic model and an N-gram language model were used. An evaluation set of 501 sentences (10,054 words) covering 35 topics was used. As a result, the best performance was obtained by our approach, with the word error rate reduction (WERR) improving by up to about 4% compared with the N-gram-based model. The WERR rose above 10% when word errors were correctly detected. The results show that this approach has the potential to adapt a model to multiple domains without sub-topic models.
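
The lattice rescoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes word embeddings are available as a dict mapping words to NumPy vectors, and the names topical_score, rescore_frame, and tm_weight are hypothetical placeholders for the paper's TM score, the per-frame re-ranking pass, and an interpolation weight.

    import numpy as np

    def cosine_similarity(a, b):
        # Cosine similarity between two embedding vectors.
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(np.dot(a, b) / denom) if denom else 0.0

    def topical_score(candidate, context_words, embeddings):
        # Average cosine similarity between a candidate word and the
        # context words of the utterance; stands in for the TM score.
        sims = [cosine_similarity(embeddings[candidate], embeddings[w])
                for w in context_words if w in embeddings]
        return float(np.mean(sims)) if sims else 0.0

    def rescore_frame(candidates, context_words, embeddings, tm_weight=1.0):
        # Re-rank the candidate words of one lattice frame.
        # candidates: list of (word, am_lm_cost) pairs, lower cost is better.
        # The TM score (higher is better) is subtracted from the AM+LM cost,
        # scaled by the hypothetical tm_weight.
        rescored = []
        for word, am_lm_cost in candidates:
            if word in embeddings:
                total = am_lm_cost - tm_weight * topical_score(word, context_words, embeddings)
            else:
                total = am_lm_cost
            rescored.append((word, total))
        return sorted(rescored, key=lambda pair: pair[1])

In this sketch, frames with many acoustically similar candidates would have their ranking shifted toward words that are topically close to the rest of the utterance, which mirrors the role the abstract assigns to the TM term.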
KSP Keywords
DNN-HMM, Data sets, Language Model, Model parameter, Multi-Domain, Preliminary study, Speech recognition system, Word Embedding, acoustic model, lattice network, n-Gram