ETRI Knowledge Sharing Platform : CMVDE: Consistent Multi-View Video Depth Estimation via Geometric-Temporal Coupling Approach

Type: Journal Article

Title: CMVDE: Consistent Multi-View Video Depth Estimation via Geometric-Temporal Coupling Approach

Authors: Hosung Son, Min-jung Shin, Minji Cho, Joonsoo Kim, Kug-jin Yun, Suk-Ju Kang

Abstract: In the field of video depth estimation, significant strides have been made with deep learning-based multi-view stereo approaches. However, existing studies struggle to produce consistently accurate depth maps from monocular video content that account for both multi-view geometry and temporal consistency. To overcome this limitation, we introduce CMVDE, a video depth estimation framework that couples multi-view geometric and temporal cues in an end-to-end manner. Our proposed geometric consistency module efficiently generates multi-view geometric features by employing mutual cross-view epipolar attention between adjacent video frames. It then compresses these features using a novel multi-scale feature compressor, producing an effective input tensor for the subsequent module. In addition, our framework enhances temporal consistency across consecutive video frames with a temporal consistency module based on convolutional LSTM [1], which leverages previous depth information as geometric guidance. On the ScanNet [2] and 7-Scenes [3] datasets, our approach surpasses previous state-of-the-art multi-view video depth estimation methods in both depth quality and consistency across consecutive frames.
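
The abstract does not spell out how the cross-view epipolar attention is built, so the following is only a minimal sketch of the geometric constraint that any epipolar attention relies on, not the authors' module. It assumes a pinhole camera with shared intrinsics and a known relative pose between two adjacent frames, and it shows that a pixel in one frame can only match pixels lying on a single line in the other frame. All function names and numeric values are hypothetical and chosen for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def skew(t):
    """Cross-product matrix [t]_x, so that [t]_x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def inverse_intrinsics(fx, fy, cx, cy):
    """Closed-form inverse of a pinhole intrinsic matrix without skew."""
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]

def fundamental_matrix(k_inv, rotation, translation):
    """F = K^-T [t]_x R K^-1 for points mapped as X2 = R X1 + t."""
    essential = matmul(skew(translation), rotation)
    return matmul(matmul(transpose(k_inv), essential), k_inv)

def epipolar_line(f, pixel):
    """Line (a, b, c) in frame 2, with a*u + b*v + c = 0, for a frame-1 pixel."""
    u, v = pixel
    col = matmul(f, [[u], [v], [1.0]])
    return (col[0][0], col[1][0], col[2][0])

def distance_to_line(line, pixel):
    """Perpendicular distance in pixels from a pixel to a line (a, b, c)."""
    a, b, c = line
    u, v = pixel
    return abs(a * u + b * v + c) / math.hypot(a, b)

def project(fx, fy, cx, cy, point):
    """Pinhole projection of a 3D point in camera coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical intrinsics and relative pose (assumptions, not from the paper).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0
angle = math.radians(5.0)  # small rotation about the y axis
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [0.10, 0.02, 0.01]

# One 3D point seen in both frames.
point_cam1 = [0.3, -0.2, 2.0]
rotated = matmul(R, [[c] for c in point_cam1])
point_cam2 = [rotated[i][0] + t[i] for i in range(3)]

pixel1 = project(FX, FY, CX, CY, point_cam1)
pixel2 = project(FX, FY, CX, CY, point_cam2)

F = fundamental_matrix(inverse_intrinsics(FX, FY, CX, CY), R, t)
line = epipolar_line(F, pixel1)

# The true match in frame 2 lies on the epipolar line (up to rounding error).
print(distance_to_line(line, pixel2))
```

An attention layer built on this constraint could, for example, restrict or weight the candidate keys in the second frame by their distance to this line, which shrinks the search from the whole image to a narrow band. Whether CMVDE does this by masking, by sampling along the line, or by another mechanism is not stated in the abstract.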

KSP Keywords: Cross-view, Depth map, Depth estimation, Depth information, End-to-End (E2E), Geometric features, Learning-based, Multi-scale feature, Multi-view stereo, Multi-view video, Temporal coupling
