ETRI-Knowledge Sharing Plaform

KOREAN
논문 검색
Type SCI
Year ~ Keyword

Detail

Journal Article Improving Visual Relationship Detection using Linguistic and Spatial Cues
Cited 5 time in scopus Download 170 time Share share facebook twitter linkedin kakaostory
Authors
Jaewon Jung, Jongyoul Park
Issue Date
2020-06
Citation
ETRI Journal, v.42, no.3, pp.399-410
ISSN
1225-6463
Publisher
한국전자통신연구원 (ETRI)
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.4218/etrij.2019-0093
Abstract
Detecting visual relationships in an image is important in an image understanding task. It enables higher image understanding tasks, that is, predicting the next scene and understanding what occurs in an image. A visual relationship comprises of a subject, a predicate, and an object, and is related to visual, language, and spatial cues. The predicate explains the relationship between the subject and object and can be categorized into different categories such as prepositions and verbs. A large visual gap exists although the visual relationship is included in the same predicate. This study improves upon a previous study (that uses language cues using two losses) and a spatial cue (that only includes individual information) by adding relative information on the subject and object of the extant study. The architectural limitation is demonstrated and is overcome to detect all zero-shot visual relationships. A new problem is discovered, and an explanation of how it decreases performance is provided. The experiment is conducted on the VRD and VG datasets and a significant improvement over previous results is obtained.
KSP Keywords
Relative information, Spatial cue, Visual relationship detection, Zero-shot, image understanding
This work is distributed under the term of Korea Open Government License (KOGL)
(Type 4: : Type 1 + Commercial Use Prohibition+Change Prohibition)
Type 4: