ETRI Knowledge Sharing Platform

Detailed Information

Conference Paper  Learning a Video-Text Joint Embedding using Korean Tagged Movie Clips
Authors
함경준, 곽창욱, 김선중
Publication Date
October 2020
Source
International Conference on Information and Communication Technology Convergence (ICTC) 2020, pp.1158-1160
DOI
https://dx.doi.org/10.1109/ICTC49870.2020.9289342
Research Project
20ZH1200, Fundamental Technology Research on Ultra-Realistic Immersive Spatial Media and Content, 이태진
Abstract
Understanding video content is a major challenge for intelligent multimedia services. Existing video retrieval approaches require manually written descriptive sentences in order to retrieve videos that match a user's search intent. To overcome this limitation, the visual concepts shared by videos and sentences must be modeled so that both can be mapped into a common vector space in which relevant videos and texts lie close to each other. In this study, we construct a new dataset of 250 Korean movies with manual text descriptions in Korean. We also introduce a video-text joint embedding model together with its quantitative and qualitative search results. With the proposed model, manual tagging of videos is no longer necessary for video retrieval services.
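The abstract's core idea, mapping videos and sentences into a common vector space where matched pairs lie close together, is commonly realized with two projection heads trained under a max-margin ranking loss. The sketch below illustrates this pattern in PyTorch; it is not the paper's actual model. The feature dimensions, layer sizes, margin value, and the hardest-negative ranking loss (VSE++-style) are all illustrative assumptions.

```python
# Minimal sketch of a video-text joint embedding; dimensions, margin, and
# loss formulation are illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    """Two linear heads that project video and text features into one space."""
    def __init__(self, video_dim=2048, text_dim=768, embed_dim=512):
        super().__init__()
        self.video_proj = nn.Linear(video_dim, embed_dim)  # e.g. mean-pooled frame features
        self.text_proj = nn.Linear(text_dim, embed_dim)    # e.g. sentence encoder output

    def forward(self, video_feats, text_feats):
        # L2-normalize so a dot product equals cosine similarity
        v = F.normalize(self.video_proj(video_feats), dim=-1)
        t = F.normalize(self.text_proj(text_feats), dim=-1)
        return v, t

def ranking_loss(v, t, margin=0.2):
    """Max-margin ranking loss over in-batch negatives (hardest negative per pair)."""
    sims = v @ t.T                                   # (B, B) pairwise similarities
    pos = sims.diag().unsqueeze(1)                   # matched video-text pairs, (B, 1)
    mask = torch.eye(sims.size(0), dtype=torch.bool, device=sims.device)
    neg = sims.masked_fill(mask, float("-inf"))      # exclude positives from negatives
    loss_v2t = F.relu(margin + neg.max(dim=1).values.unsqueeze(1) - pos).mean()
    loss_t2v = F.relu(margin + neg.max(dim=0).values.unsqueeze(1) - pos).mean()
    return loss_v2t + loss_t2v

# Toy usage: random features standing in for real video/text encodings.
model = JointEmbedding()
v, t = model(torch.randn(8, 2048), torch.randn(8, 768))
loss = ranking_loss(v, t)
```

At retrieval time, a text query is embedded once and candidate videos are ranked by cosine similarity against it, which is why manual tagging becomes unnecessary.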
KSP Suggested Keywords
Embedding model, Intelligent Multimedia, Multimedia Service, Proposed model, Search intent, Search results, Video and text, Video contents, Video retrieval, vector space, visual concepts