ETRI-Knowledge Sharing Platform

Details

Conference: Improving End-To-End Speech Translation Model with Bert-Based Contextual Information
Cited 1 time in Scopus. Downloaded 12 times.
Authors
방정욱, 이민규, 윤승, 김상훈
Publication date
2022-05
Source
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022, pp.6277-6231
DOI
https://dx.doi.org/10.1109/ICASSP43922.2022.9746117
Funded project
21ZS1100, Core Technology Research for Self-Improving Integrated Artificial Intelligence System, 송화전
Abstract
This paper proposes an end-to-end speech translation system that utilizes contextual information. Contextual information helps clarify the meaning of utterances. However, conventional end-to-end speech translation (E2E-ST) is primarily designed to handle single utterances. Thus, we introduce a context encoder that extracts contextual information from previous translation results. Here, the context encoder obtains high-quality contextual information by adopting the BERT model. Then, we combine it with speech information extracted from speech signals to generate translation results. On the widely used TED-based speech translation corpus, we show that the results of the contextual E2E-ST model are significantly better than those of the single utterance-based E2E-ST model. Furthermore, we demonstrate that contextual information contributes to the processing of unclearly spoken utterances as well as ambiguity caused by pronouns and homophones.
KSP suggested keywords
Contextual information, End-to-End (E2E), High-quality, Speech signals, Speech information, Speech translation, Translation model, Translation system