ETRI Knowledge Sharing Platform : Word-piece based natural language recognition for multimodal object retrieval

Conference Paper: Word-piece based natural language recognition for multimodal object retrieval

Cited 0 times in Scopus

Citation: International Conference on Control, Automation and Systems (ICCAS 2023), pp.905-908

Abstract: This paper proposes word-piece based natural language recognition (NLR) for multimodal object retrieval. Multimodal AI technologies that combine images and natural language are being incorporated to enable flexible interaction between humans and robots. This includes research on retrieving the appropriate images even when the natural language input does not explicitly include words that represent the images. However, the lack of such explicit words makes multimodal object retrieval challenging. The proposed approach aims to improve the performance of multimodal object retrieval through word-piece embedding-based NLR. In an experiment on multimodal object retrieval, word-piece embedding-based NLR outperformed word embedding-based NLR. For example, with a data size of 1070, the top-2 retrieval accuracy was 0.75 with word-piece embedding-based NLR and 0.69 with word embedding-based NLR.
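
The two ideas named in the abstract, splitting words into word-pieces and scoring retrieval by top-k accuracy, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the record does not describe the tokenizer, vocabulary, embedding model, or evaluation code, so the greedy longest-match-first splitting (the scheme commonly associated with WordPiece), the `##` continuation marker, the toy vocabulary, and the function names `wordpiece_tokenize` and `top_k_accuracy` are all assumptions made for illustration.

```python
from typing import List, Sequence, Set


def wordpiece_tokenize(word: str, vocab: Set[str], unk: str = "[UNK]") -> List[str]:
    """Split one word into sub-word pieces by greedy longest-match-first search.

    Assumption: non-initial pieces are stored in the vocabulary with a "##" prefix.
    """
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece covers this position: the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def top_k_accuracy(ranked: Sequence[Sequence[str]], targets: Sequence[str], k: int) -> float:
    """Fraction of queries whose target object appears among the first k ranked results."""
    if not targets:
        raise ValueError("targets must not be empty")
    hits = sum(1 for results, target in zip(ranked, targets) if target in results[:k])
    return hits / len(targets)


# Hypothetical toy vocabulary and retrieval results, for illustration only.
toy_vocab = {"un", "##afford", "##able", "cup"}
toy_pieces = wordpiece_tokenize("unaffordable", toy_vocab)

toy_ranked = [
    ["cup", "bottle", "pen"],
    ["pen", "cup", "bottle"],
    ["bottle", "pen", "cup"],
]
toy_targets = ["bottle", "cup", "cup"]
toy_top2 = top_k_accuracy(toy_ranked, toy_targets, k=2)
```

In this toy setup a word absent from the vocabulary can still be represented by known pieces rather than collapsing to a single unknown token, which is one plausible reason sub-word embeddings might help when the query wording does not match object names exactly. The top-2 metric counts a query as correct when the target object is ranked first or second.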

KSP Keywords: Data size, Flexible interaction, Language Recognition, Natural language, Word Embedding, object retrieval, retrieval accuracy, superior performance

218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, KOREA, Contact: sh.kim@etri.re.kr