ETRI-Knowledge Sharing Platform


Multi-view Human Mesh Reconstruction Via Direction-aware Feature Fusion
Cited 0 times in Scopus; downloaded 169 times
Authors
Chandeul Song, Gi-Mun Um, Won-Sik Cheong, Wonjun Kim
Issue Date
2024-11
Citation
IEEE Access, v.12, pp.160254-160266
ISSN
2169-3536
Publisher
Institute of Electrical and Electronics Engineers Inc.
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1109/ACCESS.2024.3488038
Abstract
Although multi-view inputs offer many advantages for human mesh reconstruction (e.g., robustness to occlusion), relatively few studies have explored them due to the complexity of the fusion process. In this paper, we investigate how to accurately fuse features encoded from multi-view inputs. The key idea of the proposed method is to combine multi-view image features by applying a self-attention mechanism with directional encoding. Specifically, backbone features obtained from each camera viewpoint are encoded into pose and shape features along with a directional vector, defined by the positional difference between each camera and the target subject. These encoded features are fed into a transformer to generate the fused feature through self-attention. During this process, the directional vector is expanded in dimension and incorporated into the encoder along with the pose and shape features. Given the token embeddings, the query and key matrices are multiplied to compute correlations across all viewpoints. These correlation scores are then used to adaptively generate features fused from different viewpoints. Moreover, we use these fused features to interact with global tokens, which represent learnable pose and shape embeddings for the unified mesh model. These tokens progressively recalibrate the global pose and shape features of the target subject through cross-attention. Experimental results on multi-view benchmark datasets demonstrate the effectiveness of the proposed method.
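
The two-stage pipeline in the abstract (direction-aware self-attention across views, then cross-attention against learnable global tokens) maps naturally onto standard attention modules. Below is a minimal PyTorch sketch of that layout; the module name DirectionAwareFusion, the feature dimension, the additive directional encoding, and the single pose/shape global-token scheme are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn


class DirectionAwareFusion(nn.Module):
    """Sketch: fuse per-view pose/shape features with self-attention over
    viewpoints, then recalibrate learnable global tokens via cross-attention."""

    def __init__(self, feat_dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Expand the 3-D camera-to-subject direction vector to feat_dim so it
        # can be incorporated alongside the pose and shape features.
        self.dir_embed = nn.Linear(3, feat_dim)
        # Self-attention encoder: query-key products inside this module compute
        # the cross-view correlations described in the abstract.
        enc_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True)
        self.view_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Learnable global tokens for the unified mesh model; one pose token
        # and one shape token is an assumption made for this sketch.
        self.global_tokens = nn.Parameter(torch.randn(1, 2, feat_dim))
        # Cross-attention: global tokens query the fused multi-view features.
        self.cross_attn = nn.MultiheadAttention(
            feat_dim, num_heads, batch_first=True)

    def forward(self, pose_feat, shape_feat, cam_dirs):
        # pose_feat, shape_feat: (B, V, feat_dim) per-view backbone encodings
        # cam_dirs: (B, V, 3) direction vectors from each camera to the subject
        d = self.dir_embed(cam_dirs)                        # (B, V, feat_dim)
        # Add the directional encoding to both feature streams, then stack the
        # pose and shape tokens from all V views into one token sequence.
        tokens = torch.cat([pose_feat + d, shape_feat + d], dim=1)  # (B, 2V, D)
        fused = self.view_encoder(tokens)                   # self-attention fusion
        # Global pose/shape tokens attend to the fused multi-view features.
        g = self.global_tokens.expand(pose_feat.size(0), -1, -1)
        out, _ = self.cross_attn(query=g, key=fused, value=fused)
        return out  # (B, 2, feat_dim): recalibrated global pose/shape features


# Example: batch of 2 subjects observed from 4 camera viewpoints.
model = DirectionAwareFusion()
pose = torch.randn(2, 4, 256)
shape = torch.randn(2, 4, 256)
dirs = torch.nn.functional.normalize(torch.randn(2, 4, 3), dim=-1)
print(model(pose, shape, dirs).shape)  # torch.Size([2, 2, 256])

Adding the directional embedding to the features (rather than concatenating it) is one of several reasonable choices here; the paper itself only states that the directional vector's dimensions are expanded and incorporated into the encoder.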
KSP Keywords
Attention mechanism, Benchmark datasets, Direction-aware, Feature fusion, Fused features, Fusion process, Image Features, MESH model, Mesh reconstruction, camera viewpoint, matrix multiplication
This work is distributed under the terms of the Creative Commons License (CC BY-NC-ND).