ETRI-Knowledge Sharing Platform

Detail

Conference Paper: Improving End-To-End Speech Translation Model with Bert-Based Contextual Information
Cited 4 times in Scopus
Authors
Jeong-Uk Bang, Min-Kyu Lee, Seung Yun, Sang-Hun Kim
Issue Date
2022-05
Citation
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022, pp.6227-6231
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICASSP43922.2022.9746117
Abstract
This paper proposes an end-to-end speech translation system that utilizes contextual information. Contextual information helps clarify the meaning of utterances. However, conventional end-to-end speech translation (E2E-ST) is primarily designed to handle single utterances. Thus, we introduce a context encoder that extracts contextual information from previous translation results. Here, the context encoder obtains high-quality contextual information by adopting the BERT model. Then, we combine it with speech information extracted from speech signals to generate translation results. On the widely used TED-based speech translation corpus, we show that the results of the contextual E2E-ST model are significantly better than those of the single-utterance E2E-ST model. Furthermore, we demonstrate that contextual information contributes to the processing of unclearly spoken utterances as well as ambiguity caused by pronouns and homophones.
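The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `ContextualTranslator`, the `max_context` window, the `[SEP]` joining of earlier translations, and fusion by simple concatenation of state sequences are all hypothetical choices made for the example. The three callables stand in for a speech encoder, a BERT-based context encoder, and a decoder.

```python
from collections import deque
from typing import Callable, Deque, List

State = List[float]  # one hidden vector; a stand-in for real encoder output


class ContextualTranslator:
    """Translate utterances in order, feeding earlier translations back as context."""

    def __init__(
        self,
        encode_speech: Callable[[object], List[State]],  # hypothetical speech encoder
        encode_context: Callable[[str], List[State]],    # hypothetical BERT-style context encoder
        decode: Callable[[List[State]], str],            # hypothetical decoder
        max_context: int = 2,                            # assumed window size, not from the paper
    ) -> None:
        self.encode_speech = encode_speech
        self.encode_context = encode_context
        self.decode = decode
        # Only the most recent translations are kept; older ones are dropped.
        self.history: Deque[str] = deque(maxlen=max_context)

    def context_text(self) -> str:
        # Assumed input format: previous translations joined with a separator token.
        return " [SEP] ".join(self.history)

    def translate(self, speech: object) -> str:
        speech_states = self.encode_speech(speech)
        text = self.context_text()
        # The first utterance has no context, so the context encoder is skipped.
        context_states = self.encode_context(text) if text else []
        # Assumed fusion: concatenate context states and speech states.
        hypothesis = self.decode(context_states + speech_states)
        # The new translation becomes context for the next utterance.
        self.history.append(hypothesis)
        return hypothesis
```

The point of the sketch is the feedback loop: context comes from the system's own previous outputs rather than from reference translations, so each call to `translate` both consumes and updates the history.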
KSP Keywords
Contextual information, End-to-End (E2E), High-quality, Speech Signals, Speech information, Speech translation, Translation Model, Translation system