ETRI Knowledge Sharing Platform : Three‐stream network with context convolution module for human

Journal Article Three‐stream network with context convolution module for human–object interaction detection

Cited 4 times in Scopus

Downloaded 74 times

Abstract: Human–object interaction (HOI) detection is a popular computer vision task that detects interactions between humans and objects. This task can be useful in many applications that require a deeper understanding of semantic scenes. Current HOI detection networks typically consist of a feature extractor followed by detection layers comprising small filters (e.g., 1 × 1 or 3 × 3). Although small filters can capture local spatial features with few parameters, their small receptive regions prevent them from capturing the larger context information needed to recognize interactions between humans and distant objects. Hence, we propose a three-stream HOI detection network that employs a context convolution module (CCM) in each stream branch. The CCM captures larger contexts from input feature maps by combining large separable convolution layers with residual-based convolution layers; because it uses fewer large separable filters, it does so without increasing the number of parameters. We evaluate our HOI detection method on two benchmark datasets, V-COCO and HICO-DET, and demonstrate its state-of-the-art performance.
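
The parameter argument in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' implementation: it assumes that "large separable convolution" means a k × 1 convolution followed by a 1 × k convolution, and the kernel size (15), the channel width (256), and the reduced intermediate width (64) are hypothetical values chosen only for illustration, since the abstract does not state them.

```python
def conv_params(kernel_h, kernel_w, c_in, c_out):
    """Number of weights in one 2-D convolution layer (bias omitted)."""
    return kernel_h * kernel_w * c_in * c_out


def separable_params(k, c_in, c_mid, c_out):
    """Weights of a k x 1 convolution followed by a 1 x k convolution.

    c_mid is the number of intermediate filters; choosing it smaller than
    c_in corresponds to "using fewer large separable filters".
    """
    return conv_params(k, 1, c_in, c_mid) + conv_params(1, k, c_mid, c_out)


def receptive_field(kernel_sizes):
    """Receptive field along one axis for stacked stride-1 convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf


C = 256      # hypothetical channel width of the feature map
C_MID = 64   # hypothetical reduced number of separable filters
K = 15       # hypothetical large kernel size

small = conv_params(3, 3, C, C)            # ordinary 3 x 3 layer
dense_large = conv_params(K, K, C, C)      # dense 15 x 15 layer
sep_large = separable_params(K, C, C_MID, C)

print("3x3 dense params:      ", small)        # 589824
print("15x15 dense params:    ", dense_large)  # 14745600
print("15x1 + 1x15 params:    ", sep_large)    # 491520

# Along the vertical axis the pair applies kernels of size 15 then 1;
# along the horizontal axis, 1 then 15. Both axes therefore see 15 pixels.
print("3x3 receptive field:   ", receptive_field([3]))      # 3
print("separable, vertical:   ", receptive_field([K, 1]))   # 15
print("separable, horizontal: ", receptive_field([1, K]))   # 15
```

Under these assumed sizes, the separable pair covers a 15 × 15 region with fewer weights than a single 3 × 3 layer, whereas a dense 15 × 15 layer would need 25 times as many weights as the 3 × 3 layer. The residual-based convolution layers mentioned in the abstract are not modelled here.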

KSP Keywords: Art performance, Benchmark datasets, Computer Vision(CV), Context Information, Detection Method, Feature map, Interaction detection, Separable Filters, Stream network, feature extractor, object interaction
