ETRI-Knowledge Sharing Platform

See More, Store Less: Memory-Efficient Resolution for Video Moment Retrieval
Authors
Mingyu Jeon, Sungjin Han, Jinkwon Hwang, Minchol Kwon, Jonghee Kim, Junyeong Kim
Issue Date
2026-03
Citation
Findings of the Association for Computational Linguistics: EACL 2026, pp.1726-1736
Publisher
ACL
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.18653/v1/2026.findings-eacl.87
Abstract
Recent advances in Multimodal Large Language Models (MLLMs) have improved image recognition and reasoning, but video-related tasks remain challenging due to memory constraints from dense frame processing. Existing Video Moment Retrieval (VMR) methodologies rely on sparse frame sampling, risking potential information loss, especially in lengthy videos. We propose SMORE (See MORE, store less), a framework that enhances memory efficiency while maintaining high information resolution. SMORE (1) uses query-guided captions to encode semantics aligned with user intent, (2) applies query-aware importance modulation to highlight relevant segments, and (3) adaptively compresses frames to preserve key content while reducing redundancy. This enables efficient video understanding without exceeding memory budgets. Experimental validation reveals that SMORE achieves state-of-the-art performance on QVHighlights, Charades-STA, and ActivityNet-Captions benchmarks.
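The abstract gives no implementation details, so the following is only an illustrative sketch of how steps (2) and (3) might look, not the authors' SMORE code. It assumes frames and the query are already embedded as plain float vectors. The function names, the softmax temperature, and the merge rule (keep the highest-weighted frames intact, average each run of frames between them) are all hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def importance(query_vec, frame_vecs, temperature=0.1):
    """Hypothetical query-aware weighting: softmax over query-frame similarity."""
    sims = [cosine(query_vec, f) / temperature for f in frame_vecs]
    top = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def _merge(indices, frame_vecs, weights):
    """Weighted average of the frames at `indices` (uniform if weights sum to 0)."""
    total = sum(weights[i] for i in indices)
    dim = len(frame_vecs[indices[0]])
    if total == 0:
        return [sum(frame_vecs[i][d] for i in indices) / len(indices) for d in range(dim)]
    return [sum(weights[i] * frame_vecs[i][d] for i in indices) / total for d in range(dim)]


def compress(frame_vecs, weights, budget):
    """Hypothetical adaptive compression to at most `budget` vectors.

    Returns (first_frame, last_frame, vector) tuples so each stored vector
    can still be mapped back to a time span in the video.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    n = len(frame_vecs)
    if n <= budget:
        return [(i, i, list(v)) for i, v in enumerate(frame_vecs)]
    # k kept frames split the video into at most k + 1 runs, so 2k + 1 <= budget.
    k = (budget - 1) // 2
    ranked = sorted(range(n), key=lambda i: weights[i], reverse=True)
    keep = set(ranked[:k])
    out, run = [], []
    for i in range(n):
        if i in keep:
            if run:
                out.append((run[0], run[-1], _merge(run, frame_vecs, weights)))
                run = []
            out.append((i, i, list(frame_vecs[i])))
        else:
            run.append(i)
    if run:
        out.append((run[0], run[-1], _merge(run, frame_vecs, weights)))
    return out
```

In this sketch every frame still contributes to some stored vector, which loosely mirrors the "see more, store less" idea: nothing is dropped by sparse sampling, but low-importance stretches are stored at coarser resolution.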
KSP Keywords
Art performance, Frame processing, Image recognition, Information loss, Memory efficiency, Potential information, User intent, Experimental validation, Language models, Memory-efficient, State-of-the-art
This work is distributed under the terms of a Creative Commons License (CC BY).