ETRI-Knowledge Sharing Platform


Multi-view Human Mesh Reconstruction Via Direction-aware Feature Fusion
Cited 0 times in Scopus; downloaded 169 times
Authors
Chandeul Song, Gi-Mun Um, Won-Sik Cheong, Wonjun Kim
Issue Date
2024-11
Citation
IEEE Access, v.12, pp.160254-160266
ISSN
2169-3536
Publisher
Institute of Electrical and Electronics Engineers Inc.
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/ACCESS.2024.3488038
Abstract
Although multi-view inputs offer many advantages for human mesh reconstruction (e.g., robustness to occlusion), relatively few studies have explored them due to the complexity of the fusion process. In this paper, we investigate how to accurately fuse features encoded from multi-view inputs. The key idea of the proposed method is to combine multi-view image features by applying a self-attention mechanism with directional encoding. Specifically, backbone features obtained from each camera viewpoint are encoded into pose and shape features along with a directional vector, defined by the positional difference between each camera and the target subject. These encoded features are fed into a transformer to generate the fused feature through self-attention. During this process, the directional vector is expanded in dimension and incorporated into the encoder along with the pose and shape features. Given the token embeddings, the query and key matrices are multiplied to compute correlations across all viewpoints. These correlation scores are then used to adaptively generate features fused from different viewpoints. Moreover, we use these fused features to interact with global tokens, which represent learnable pose and shape embeddings for the unified mesh model. These tokens progressively recalibrate the global pose and shape features of the target subject through cross-attention. Experimental results on multi-view benchmark datasets demonstrate the effectiveness of the proposed method.
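
The two-stage pipeline in the abstract (direction-aware self-attention across views, then cross-attention against learnable global tokens) maps naturally onto standard attention modules. Below is a minimal PyTorch sketch of that layout; the module name DirectionAwareFusion, the feature dimension, the additive directional encoding, and the single pose/shape global-token scheme are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn


class DirectionAwareFusion(nn.Module):
    """Sketch: fuse per-view pose/shape features with self-attention over
    viewpoints, then recalibrate learnable global tokens via cross-attention."""

    def __init__(self, feat_dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Expand the 3-D camera-to-subject direction vector to feat_dim so it
        # can be incorporated alongside the pose and shape features.
        self.dir_embed = nn.Linear(3, feat_dim)
        # Self-attention encoder: query-key products inside this module compute
        # the cross-view correlations described in the abstract.
        enc_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True)
        self.view_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Learnable global tokens for the unified mesh model; one pose token
        # and one shape token is an assumption made for this sketch.
        self.global_tokens = nn.Parameter(torch.randn(1, 2, feat_dim))
        # Cross-attention: global tokens query the fused multi-view features.
        self.cross_attn = nn.MultiheadAttention(
            feat_dim, num_heads, batch_first=True)

    def forward(self, pose_feat, shape_feat, cam_dirs):
        # pose_feat, shape_feat: (B, V, feat_dim) per-view backbone encodings
        # cam_dirs: (B, V, 3) direction vectors from each camera to the subject
        d = self.dir_embed(cam_dirs)                        # (B, V, feat_dim)
        # Add the directional encoding to both feature streams, then stack the
        # pose and shape tokens from all V views into one token sequence.
        tokens = torch.cat([pose_feat + d, shape_feat + d], dim=1)  # (B, 2V, D)
        fused = self.view_encoder(tokens)                   # self-attention fusion
        # Global pose/shape tokens attend to the fused multi-view features.
        g = self.global_tokens.expand(pose_feat.size(0), -1, -1)
        out, _ = self.cross_attn(query=g, key=fused, value=fused)
        return out  # (B, 2, feat_dim): recalibrated global pose/shape features


# Example: batch of 2 subjects observed from 4 camera viewpoints.
model = DirectionAwareFusion()
pose = torch.randn(2, 4, 256)
shape = torch.randn(2, 4, 256)
dirs = torch.nn.functional.normalize(torch.randn(2, 4, 3), dim=-1)
print(model(pose, shape, dirs).shape)  # torch.Size([2, 2, 256])

Adding the directional embedding to the features (rather than concatenating it) is one of several reasonable choices here; the paper itself only states that the directional vector's dimensions are expanded and incorporated into the encoder.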
KSP Keywords
Attention mechanism, Benchmark datasets, Direction-aware, Feature fusion, Fused features, Fusion process, Image Features, MESH model, Mesh reconstruction, camera viewpoint, matrix multiplication
This work is distributed under the terms of the Creative Commons License (CC BY-NC-ND).