ETRI Knowledge Sharing Platform

Details

Journal article: Improving Statistical Machine Translation using Shallow Linguistic Knowledge
Cited 9 times in Scopus
Authors
Young-Sook Hwang (황영숙), Andrew Finch, Yutaka Sasaki
Publication date
April 2007
Source
Computer Speech and Language, v.21 no.2, pp.350-372
ISSN
0885-2308
Publisher
Elsevier
DOI
https://dx.doi.org/10.1016/j.csl.2006.06.007
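Contract project
06MW1900, Development of Application-Specialized Korean-Chinese-English Automatic Translation Technology, Sang-Kyu Park (박상규)
Abstract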
We describe methods for improving the performance of statistical machine translation (SMT) between four linguistically different languages, i.e., Chinese, English, Japanese, and Korean, by using morphosyntactic knowledge. To reduce translation ambiguity and generate grammatically correct and fluent translation output, we address the use of shallow linguistic knowledge, that is: (1) enriching a word with its morphosyntactic features, (2) obtaining shallow linguistically-motivated phrase pairs, (3) iteratively refining word alignment using filtered phrase pairs, and (4) building a language model from morphosyntactically enriched words. Previous studies reported that introducing syntactic features into SMT models resulted in only a slight improvement in performance in spite of the heavy computational expense; however, this study demonstrates the effectiveness of morphosyntactic features when reliable, discriminative features are used. Our experimental results show that word representations that incorporate morphosyntactic features significantly improve the performance of the translation model and language model. Moreover, we show that refining the word alignment using fine-grained phrase pairs is effective in improving system performance. © 2006 Elsevier Ltd. All rights reserved.
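Steps (1) and (4) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The "word|TAG" token format, the toy hand-tagged corpus, the add-one smoothing, and the function names enrich, train_bigram_lm, and bigram_prob are all assumptions made for illustration. The sketch only shows how attaching a part-of-speech tag to each word lets a language model keep separate statistics for, say, "book" as a verb and "book" as a noun.

```python
from collections import Counter


def enrich(tagged_sentence):
    """Join each word with its tag, e.g. ("book", "VB") -> "book|VB" (assumed format)."""
    return [f"{word}|{tag}" for word, tag in tagged_sentence]


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over token sequences padded with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(model, prev, word):
    """Add-one smoothed estimate of P(word | prev); smoothing choice is an assumption."""
    unigrams, bigrams = model
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


# Toy, hand-tagged corpus (illustrative only, not data from the paper).
corpus = [
    [("I", "PRP"), ("book", "VB"), ("a", "DT"), ("flight", "NN")],
    [("I", "PRP"), ("read", "VB"), ("a", "DT"), ("book", "NN")],
]
enriched_corpus = [enrich(sentence) for sentence in corpus]
model = train_bigram_lm(enriched_corpus)

# After a pronoun, the verb reading of "book" scores higher than the noun reading.
p_verb = bigram_prob(model, "I|PRP", "book|VB")
p_noun = bigram_prob(model, "I|PRP", "book|NN")
print(p_verb, p_noun)
```

With plain surface words, both readings of "book" would share one vocabulary entry; the enriched tokens give the model a way to prefer the reading that fits the context.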
KSP suggested keywords
Computational expense, Discriminative feature, Language model, Linguistic knowledge, Machine Translation (MT), Statistical Machine Translation, Syntactic features, System performance, Translation Model, Word Alignment, Fine-grained