ETRI Knowledge Sharing Platform : Generative Scene Graph Networks

Conference Paper: Generative Scene Graph Networks

Cited 25 times in Scopus

Citation: International Conference on Learning Representations (ICLR) 2021, pp.1-23

Abstract: Human perception excels at building compositional hierarchies of parts and objects from unlabeled scenes that help systematic generalization. Yet most work on generative scene modeling either ignores the part-whole relationship or assumes access to predefined part labels. In this paper, we propose Generative Scene Graph Networks (GSGNs), the first deep generative model that learns to discover the primitive parts and infer the part-whole relationship jointly from multi-object scenes without supervision and in an end-to-end trainable way. We formulate GSGN as a variational autoencoder in which the latent representation is a tree-structured probabilistic scene graph. The leaf nodes in the latent tree correspond to primitive parts, and the edges represent the symbolic pose variables required for recursively composing the parts into whole objects and then the full scene. This allows novel objects and scenes to be generated both by sampling from the prior and by manual configuration of the pose variables, as we do with graphics engines. We evaluate GSGN on datasets of scenes containing multiple compositional objects, including a challenging Compositional CLEVR dataset that we have developed. We show that GSGN is able to infer the latent scene graph, generalize out of the training regime, and improve data efficiency in downstream tasks.
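The recursive composition described in the abstract (leaf nodes as primitive parts, edges carrying pose variables, parts composed into objects and then the scene) can be illustrated with a small Python sketch. This is not the GSGN model: it involves no learning, no probabilistic latents, and no rendering. The `Pose` fields (a uniform scale and a 2D offset), the `Node` class, and the `flatten` helper are all hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Pose:
    """Hypothetical edge pose: uniform scale and 2D offset relative to the parent."""
    scale: float = 1.0
    dx: float = 0.0
    dy: float = 0.0

    def compose(self, child: "Pose") -> "Pose":
        # Re-express the child's pose in the frame this pose is expressed in.
        return Pose(
            scale=self.scale * child.scale,
            dx=self.dx + self.scale * child.dx,
            dy=self.dy + self.scale * child.dy,
        )


@dataclass
class Node:
    """Tree node; a node without children stands for a primitive part."""
    name: str
    children: List[Tuple[Pose, "Node"]] = field(default_factory=list)


def flatten(node: Node, pose: Optional[Pose] = None) -> List[Tuple[str, Pose]]:
    """Walk the tree and return each leaf part with its pose in scene coordinates."""
    pose = Pose() if pose is None else pose
    if not node.children:
        return [(node.name, pose)]
    parts: List[Tuple[str, Pose]] = []
    for edge_pose, child in node.children:
        parts.extend(flatten(child, pose.compose(edge_pose)))
    return parts


# Scene -> object -> parts: manually configured poses, as with a graphics engine.
leg = Node("leg")
seat = Node("seat")
chair = Node("chair", [(Pose(0.5, 0.0, -0.5), leg), (Pose(1.0, 0.0, 0.5), seat)])
scene = Node("scene", [(Pose(0.5, 2.0, 1.0), chair)])

parts = flatten(scene)
for name, p in parts:
    print(name, p)
```

Changing a single edge pose (for example, the pose of `chair` within `scene`) moves every part beneath it, which is the sense in which pose variables on edges let whole objects be reconfigured by hand.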

KSP Keywords: Deep generative model, End to End (E2E), End-to-end trainable, Graph networks, Multi-object, Part-whole relationship, Scene graph, Scene modeling, Human perception
