ETRI-Knowledge Sharing Platform

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation
Cited 16 times in Scopus; downloaded 42 times
Authors
Hyung-Bae Jeon, Soo-Young Lee
Issue Date
2016-06
Citation
ETRI Journal, v.38, no.3, pp.487-493
ISSN
1225-6463
Publisher
Electronics and Telecommunications Research Institute (ETRI)
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.4218/etrij.16.0115.0499
Abstract
Two new methods are proposed for the unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. In the training phase, training documents are clustered using latent Dirichlet allocation (LDA), and a domain-specific LM is then trained for each cluster. In the test phase, an adapted LM is expressed as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize the trained LDA model to estimate the weight values assigned to the trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are reliable. In continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
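The linear mixture described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the weights are computed from the LDA model, so the weights here are simply assumed to be given (for example, a topic posterior for the test sentence). The function names, the toy unigram LMs, and the `floor` parameter are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def normalize(weights: List[float]) -> List[float]:
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def adapted_prob(
    word: str,
    topic_weights: List[float],
    domain_lms: List[Dict[str, float]],
    floor: float = 1e-9,
) -> float:
    """Probability of `word` under a linear mixture of domain-specific LMs.

    Each domain LM is a toy unigram table (word -> probability); a real
    system would use n-gram LMs conditioned on the word history.
    `floor` is an assumed fallback probability for unseen words.
    """
    if len(topic_weights) != len(domain_lms):
        raise ValueError("need exactly one weight per domain LM")
    weights = normalize(topic_weights)
    return sum(w * lm.get(word, floor) for w, lm in zip(weights, domain_lms))


# Toy domain-specific LMs, one per (hypothetical) LDA cluster.
sports_lm = {"goal": 0.6, "market": 0.1, "the": 0.3}
finance_lm = {"goal": 0.1, "market": 0.6, "the": 0.3}

# Assumed mixture weights for a single test sentence.
topic_weights = [0.8, 0.2]

p_goal = adapted_prob("goal", topic_weights, [sports_lm, finance_lm])
p_market = adapted_prob("market", topic_weights, [sports_lm, finance_lm])
```

With these assumed weights, the adapted LM favors "goal" over "market" because the sentence is weighted toward the first cluster; a word that is equally likely in every domain LM keeps its probability regardless of the weights.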
KSP Keywords
Benchmark test, Continuous speech recognition, Domain-specific, LDA model, LM adaptation, Latent Dirichlet allocation (LDA), Latent semantic analysis, Automatic transcription, Language model adaptation, n-gram, Non-negative matrix factorization (NMF)