ETRI Knowledge Sharing Platform


Stochastic Video Prediction with Perceptual Loss
Authors
Donghun Lee, Ingook Jang, Seonghyun Kim, Chanwon Park, Junhee Park
Issue Date
2021-12
Citation
Conference on Neural Information Processing Systems (NeurIPS) 2021 : Workshop, pp.1-8
Language
English
Type
Conference Paper
Abstract
Predicting future states is a challenging part of a decision-making system because of its inherently uncertain nature. Most works in this literature are based on deep generative networks such as variational autoencoders, which use pixel-wise reconstruction in their loss functions. Predicting the future with pixel-wise reconstruction alone can fail to capture the full distribution of high-level representations and results in inaccurate, blurred predictions. In this paper, we propose stochastic video generation with perceptual loss (SVG-PL) to reduce uncertainty and blurring in future prediction. The proposed model combines a perceptual loss function with a pixel-wise loss function for image reconstruction and future state prediction. The model is built on a variational autoencoder that reduces high-dimensional inputs to latent variables capturing both the spatial information and the temporal dynamics of future prediction. We show that using perceptual loss for video prediction improves reconstruction ability and results in clearer predictions. Improvements in video prediction could further aid the decision-making process in multiple downstream applications.
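The combined objective described in the abstract (pixel-wise reconstruction plus a perceptual, feature-space term) can be sketched as follows. This is an illustrative PyTorch sketch, not the authors' released code: the tiny convolutional `feature_net` stands in for a frozen pretrained network (e.g. VGG features), and the weighting `lambda_p` is an assumed hyperparameter.

```python
import torch
import torch.nn as nn

# Stand-in feature extractor for the perceptual term. In practice this would be
# a frozen pretrained network (e.g. VGG features); this small random network is
# an assumption used only to keep the sketch self-contained.
feature_net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1),
)
for p in feature_net.parameters():
    p.requires_grad_(False)  # perceptual features are not trained

def combined_loss(pred, target, lambda_p=0.1):
    """Pixel-wise MSE plus a perceptual (feature-space) MSE, weighted by lambda_p."""
    pixel = nn.functional.mse_loss(pred, target)
    perceptual = nn.functional.mse_loss(feature_net(pred), feature_net(target))
    return pixel + lambda_p * perceptual

# Example: a batch of 2 predicted vs. target frames (3 x 32 x 32).
pred = torch.rand(2, 3, 32, 32)
target = torch.rand(2, 3, 32, 32)
loss = combined_loss(pred, target)
```

In a VAE-based predictor such as SVG, a term like this would replace the purely pixel-wise reconstruction term of the evidence lower bound, alongside the usual KL regularizer on the latent variables.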
KSP Keywords
Decision-making system, Decision-making process, Image reconstruction, Latent variables, Proposed model, Reconstruction ability, Temporal dynamics, Video generation, Video prediction, Future prediction, High dimensionality