ETRI Knowledge Sharing Platform


SWAG-Net: Semantic Word-Aware Graph Network for Temporal Video Grounding
Cited 3 times in Scopus
Authors
Sunoh Kim, Taegil Ha, Kimin Yun, Jin Young Choi
Issue Date
2022-10
Citation
International Conference on Information and Knowledge Management (CIKM) 2022, pp.982-992
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1145/3511808.3557463
Abstract
In this paper, to effectively capture non-sequential dependencies among semantic words for temporal video grounding, we propose a novel framework called Semantic Word-Aware Graph Network (SWAG-Net), which adopts graph-guided semantic word embedding in an end-to-end manner. Specifically, we define semantic word features as node features of semantic word-aware graphs and word-to-word correlations as three edge types (i.e., intrinsic, extrinsic, and relative edges) for diverse graph structures. We then apply Semantic Word-aware Graph Convolutional Networks (SW-GCNs) to the graphs for semantic word embedding. For modality fusion and context modeling, the embedded features and video segment features are merged into bi-modal features, and the bi-modal features are aggregated by incorporating local and global contextual information. Leveraging the aggregated features, the proposed method effectively finds a temporal boundary semantically corresponding to a sentence query in an untrimmed video. We verify that our SWAG-Net outperforms state-of-the-art methods on Charades-STA and ActivityNet Captions datasets.
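To make the described pipeline more concrete, below is a minimal, illustrative PyTorch sketch of the two ideas the abstract names: graph convolution over semantic word nodes with three edge types (intrinsic, extrinsic, relative) and fusion of the embedded words with video segment features into bi-modal features. All module names, shapes, and design details here are assumptions for illustration only, not the authors' released implementation.

```python
# Illustrative sketch only: loosely follows the abstract's description of
# SW-GCN word embedding and bi-modal fusion. Shapes and names are assumed.
import torch
import torch.nn as nn


class SemanticWordGCN(nn.Module):
    """One graph-convolution step over word nodes with three edge types."""

    def __init__(self, dim):
        super().__init__()
        # one projection per assumed edge type: intrinsic, extrinsic, relative
        self.edge_proj = nn.ModuleList([nn.Linear(dim, dim) for _ in range(3)])
        self.out = nn.Linear(dim, dim)

    def forward(self, words, adjs):
        # words: (B, N, D) word/node features
        # adjs: list of three (B, N, N) adjacency matrices, one per edge type
        msgs = 0
        for adj, proj in zip(adjs, self.edge_proj):
            # row-normalize each adjacency so aggregated messages stay at a stable scale
            norm = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-6)
            msgs = msgs + norm @ proj(words)
        # residual update of the node features
        return torch.relu(self.out(msgs) + words)


class BiModalFusion(nn.Module):
    """Merge embedded word features with video segment features."""

    def __init__(self, dim):
        super().__init__()
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, word_emb, video_segs):
        # word_emb: (B, N, D) graph-embedded words; video_segs: (B, T, D) segment features
        sent = word_emb.mean(dim=1, keepdim=True)              # (B, 1, D) pooled query
        sent = sent.expand(-1, video_segs.size(1), -1)         # broadcast over T segments
        return torch.relu(self.fuse(torch.cat([video_segs, sent], dim=-1)))  # (B, T, D)
```

In the paper's full model, the bi-modal features would additionally be aggregated with local and global context before predicting the temporal boundary; that stage is omitted from this sketch.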
KSP Keywords
Bi-modal, Context modeling, Contextual information, Convolutional networks, End-to-end (E2E), Graph networks, Graph-guided, Semantic word embedding, Temporal boundary, State-of-the-art, Video segment