ETRI-Knowledge Sharing Platform

HiCM2: Hierarchical Compact Memory Modeling for Dense Video Captioning
Cited 1 time in Scopus
Authors
Minkuk Kim, Hyeon Bae Kim, Jinyoung Moon, Jinwoo Choi, Seong Tae Kim
Issue Date
2025-02
Citation
The Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI) 2025, pp.4293-4301
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1609/aaai.v39i4.32451
Abstract
With the growing demand for solutions to real-world video challenges, interest in dense video captioning (DVC) has been on the rise. DVC involves the automatic captioning and localization of untrimmed videos. Several studies highlight the challenges of DVC and introduce improved methods utilizing prior knowledge, such as pre-training and external memory. In this research, we propose a model that leverages the prior knowledge of human-oriented hierarchical compact memory inspired by human memory hierarchy and cognition. To mimic human-like memory recall, we construct a hierarchical memory and a hierarchical memory reading module. We build an efficient hierarchical compact memory by employing clustering of memory events and summarization using large language models. Comparative experiments demonstrate that this hierarchical memory recall process improves the performance of DVC by achieving state-of-the-art performance on YouCook2 and ViTT datasets.
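The abstract's two ideas, a clustered and summarized memory plus a top-down read, can be illustrated with a small sketch. This is not the authors' implementation: the two-level layout, the plain k-means with cosine similarity, the deterministic initialization, and the helper names (`build_memory`, `read_memory`, `summarize`) are all assumptions made for illustration, and `summarize` is only a placeholder where a large language model would be called.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iters=10):
    """Plain k-means; assumption: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its most similar centroid.
        assign = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid from its members (skip empty clusters).
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = mean(members)
    return assign, centroids


def summarize(captions):
    """Hypothetical stand-in for an LLM summarizer: it only joins captions."""
    return " / ".join(captions)


def build_memory(events, k):
    """Cluster (embedding, caption) events and attach a summary per cluster."""
    assign, centroids = kmeans([emb for emb, _ in events], k)
    memory = []
    for c in range(k):
        members = [ev for ev, a in zip(events, assign) if a == c]
        if members:
            memory.append({
                "centroid": centroids[c],
                "summary": summarize([cap for _, cap in members]),
                "events": members,
            })
    return memory


def read_memory(memory, query, top_events=1):
    """Top-down read: choose the closest cluster, then its closest events."""
    best = max(memory, key=lambda node: cosine(query, node["centroid"]))
    ranked = sorted(best["events"],
                    key=lambda ev: cosine(query, ev[0]), reverse=True)
    return best["summary"], [cap for _, cap in ranked[:top_events]]


# Toy data: 2-D embeddings and captions invented for this example.
EVENTS = [
    ([1.0, 0.0], "chop onions"),
    ([0.0, 1.0], "boil water"),
    ([0.9, 0.1], "slice garlic"),
    ([0.1, 0.9], "add pasta"),
]
MEMORY = build_memory(EVENTS, k=2)
SUMMARY, CAPTIONS = read_memory(MEMORY, query=[0.8, 0.2])
```

Reading from the summary level first keeps the search small: the query is compared against one centroid per cluster before any individual event is scored, which is the general motivation for a hierarchical read over a flat scan.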
KSP Keywords
State-of-the-art performance, Hierarchical memory, Human memory, Human-like, Improved method, Memory hierarchy, Memory recall, Pre-training, Real-world, Video captioning, External memory