ETRI-Knowledge Sharing Platform

Blending 3D geometry and machine learning for multi-view stereopsis
Authors
Vibhas Vats, Md Alimoor Reza, David Crandall, Soon-heung Jung
Issue Date
2025-11
Citation
Neurocomputing, v.655, pp.1-12
ISSN
0925-2312
Publisher
Elsevier
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1016/j.neucom.2025.131250
Abstract
Traditional multi-view stereo (MVS) methods primarily depend on photometric and geometric consistency constraints. In contrast, modern learning-based algorithms often rely on the plane sweep algorithm to infer 3D geometry, applying explicit geometric consistency (GC) checks only as a post-processing step, with no impact on the learning process itself. In this work, we introduce GC-MVSNet++, a novel approach that actively enforces geometric consistency of reference view depth maps across multiple source views (multi-view) and at various scales (multi-scale) during the learning phase (see Fig. 1). This integrated GC check significantly accelerates the learning process by directly penalizing geometrically inconsistent pixels, effectively halving the number of training iterations compared to other MVS methods. Furthermore, we introduce a densely connected cost regularization network with two distinct block designs – simple and feature-dense – optimized to harness dense feature connections for enhanced regularization. Extensive experiments demonstrate that our approach achieves a new state-of-the-art on the BlendedMVS dataset, and competitive performance on the DTU and Tanks and Temples benchmarks. To our knowledge, GC-MVSNet++ is among the few approaches that enforce supervised geometric consistency across multiple views and at multiple scales during training. Our code is available at https://github.com/vkvats/GC-MVSNet-PlusPlus.
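The geometric consistency check mentioned in the abstract can be illustrated with a forward-backward reprojection test, a common formulation in MVS pipelines. The sketch below is not the authors' implementation (see their repository for that). It assumes pinhole intrinsics `K`, 4x4 world-to-camera extrinsics `T`, source depth maps of the same resolution as the reference, nearest-neighbor depth sampling, and illustrative thresholds of 1 pixel and 1% relative depth. The function names `forward_backward_error` and `inconsistency_count` are hypothetical. The per-pixel count it returns is one plausible signal a training loss could use to penalize inconsistent pixels.

```python
import numpy as np

def forward_backward_error(depth_ref, K_ref, T_ref, depth_src, K_src, T_src):
    """Project reference pixels into a source view and back again.

    Returns per-pixel reprojection error (pixels), relative depth error,
    and a mask of pixels that land inside the source image.
    Assumes T_* are 4x4 world-to-camera matrices (an assumption of this sketch).
    """
    h, w = depth_ref.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float64),
                         np.arange(h, dtype=np.float64))
    x, y = xs.ravel(), ys.ravel()
    d_ref = depth_ref.ravel().astype(np.float64)

    ref_to_src = T_src @ np.linalg.inv(T_ref)
    src_to_ref = np.linalg.inv(ref_to_src)

    # Forward: back-project reference pixels to 3D, move to the source camera.
    pts = np.linalg.inv(K_ref) @ np.stack([x, y, np.ones_like(x)]) * d_ref
    pts = ref_to_src[:3, :3] @ pts + ref_to_src[:3, 3:4]
    proj = K_src @ pts
    in_front = proj[2] > 1e-8
    z = np.where(in_front, proj[2], 1.0)  # avoid division by zero
    u, v = proj[0] / z, proj[1] / z

    # Sample the source depth map (nearest neighbor, for simplicity).
    ui, vi = np.rint(u).astype(int), np.rint(v).astype(int)
    visible = in_front & (ui >= 0) & (ui < w) & (vi >= 0) & (vi < h)
    d_src = depth_src[np.clip(vi, 0, h - 1), np.clip(ui, 0, w - 1)]

    # Backward: lift the source samples to 3D and return to the reference view.
    pts = np.linalg.inv(K_src) @ np.stack([u, v, np.ones_like(u)]) * d_src
    pts = src_to_ref[:3, :3] @ pts + src_to_ref[:3, 3:4]
    proj = K_ref @ pts
    d_back = proj[2]
    zb = np.where(d_back > 1e-8, d_back, 1.0)

    pix_err = np.hypot(proj[0] / zb - x, proj[1] / zb - y)
    depth_err = np.abs(d_back - d_ref) / np.maximum(d_ref, 1e-8)
    return (pix_err.reshape(h, w), depth_err.reshape(h, w),
            visible.reshape(h, w))

def inconsistency_count(depth_ref, K_ref, T_ref, sources,
                        pix_thresh=1.0, depth_thresh=0.01):
    """Count, per reference pixel, the source views that fail the check.

    `sources` is a list of (depth_src, K_src, T_src) tuples. Pixels that
    fall outside a source image are counted as inconsistent (an assumption).
    """
    count = np.zeros(depth_ref.shape, dtype=np.int64)
    for depth_src, K_src, T_src in sources:
        pix_err, depth_err, visible = forward_backward_error(
            depth_ref, K_ref, T_ref, depth_src, K_src, T_src)
        consistent = visible & (pix_err < pix_thresh) & (depth_err < depth_thresh)
        count += ~consistent
    return count
```

Running the same count on downsampled depth maps, with intrinsics scaled to match, would extend this single-scale sketch toward the multi-scale setting the abstract describes.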
KSP Keywords
3D geometry, Competitive performance, Consistency constraints, Depth Map, Learning Process, Learning-based algorithms, Multi-scale, Multi-view stereo, Multiple Scales, Multiple sources, Multiple views