ETRI Knowledge Sharing Platform : Multimodal Alzheimer’s disease recognition from image, text and audio

Journal Article Multimodal Alzheimer’s disease recognition from image, text and audio

Cited 1 time in Scopus

Abstract: Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function. One widely used diagnostic approach involves analyzing patients’ verbal descriptions of pictures. While prior studies have primarily focused on speech- and text-based models, the integration of visual context is still at an early stage. This study proposes a novel multimodal AD prediction model that integrates image, text, and audio modalities. The image and text modalities are processed using a vision-language model and structured as a bipartite graph before fusion, while all three modalities are integrated through a combination of co-attention-based intermediate fusion and late fusion, enabling effective inter-modality cooperation. The proposed model achieves an accuracy of 90.61%, outperforming state-of-the-art models. Furthermore, an ablation study quantifies the contribution of each modality using Shapley values, which serve as the foundation for a novel auxiliary loss function that adaptively adjusts modality importance during training. The findings indicate that integrating image, text, and audio modalities via a co-attention-based intermediate fusion enhances AD classification performance. Additionally, this study analyzes modality-specific attention patterns and key linguistic tokens, demonstrating that audio and text provide complementary cues for AD classification.
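The abstract states that Shapley values are used to quantify each modality's contribution. As a minimal sketch of that idea only, and not the authors' implementation, the snippet below computes exact Shapley values for three modalities from a table of accuracies, one per modality subset. The function name `shapley_values`, the table `TOY_ACCURACY`, and every number in it are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
from itertools import combinations
from math import factorial

MODALITIES = ("image", "text", "audio")

# Hypothetical accuracies for a model restricted to each subset of modalities.
# These are made-up placeholder values, NOT results reported in the paper.
TOY_ACCURACY = {
    frozenset(): 0.50,
    frozenset({"image"}): 0.60,
    frozenset({"text"}): 0.75,
    frozenset({"audio"}): 0.70,
    frozenset({"image", "text"}): 0.80,
    frozenset({"image", "audio"}): 0.76,
    frozenset({"text", "audio"}): 0.85,
    frozenset({"image", "text", "audio"}): 0.90,
}


def shapley_values(players, value):
    """Exact Shapley value of each player.

    `value` maps a frozenset of players to a score (here, an accuracy).
    Each player's value is its marginal contribution averaged over all
    subsets S of the other players, weighted by |S|! (n - |S| - 1)! / n!.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


if __name__ == "__main__":
    phi = shapley_values(MODALITIES, lambda s: TOY_ACCURACY[s])
    for name, contribution in phi.items():
        print(f"{name}: {contribution:.4f}")
```

By the efficiency property of Shapley values, the three contributions sum to the gap between the full-model score and the empty-set baseline, which makes them convenient as per-modality importance weights. How the paper turns such values into its auxiliary loss is not described in the abstract and is not sketched here.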

KSP Keywords: AD classification, Bipartite graph, Classification Performance, Cognitive function, Disease recognition, Inter-modality, Proposed model, Shapley values, Visual Context, language models, late fusion

This work is distributed under the terms of a Creative Commons License (CC BY-NC-ND).

218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, KOREA, Contact: sh.kim@etri.re.kr
