ETRI-Knowledge Sharing Platform

Alzheimer’s disease recognition using graph neural network by leveraging image-text similarity from vision language model
Cited 2 times in Scopus; downloaded 1653 times
Authors
Byounghwa Lee, Jeong-Uk Bang, Hwa Jeon Song, Byung Ok Kang
Issue Date
2025-01
Citation
Scientific Reports, v.15, pp.1-14
ISSN
2045-2322
Publisher
Nature Publishing Group
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1038/s41598-024-82597-z
Abstract
Alzheimer’s disease (AD), a progressive neurodegenerative condition, notably impacts cognitive functions and daily activity. One method of detecting dementia involves a task where participants describe a given picture, and extensive research has been conducted using the participants’ speech and transcribed text. However, very few studies have explored the modality of the image itself. In this work, we propose a method that predicts dementia automatically by representing the relationship between images and texts as a graph. First, we transcribe the participants’ speech into text using an automatic speech recognition system. Then, we employ a vision language model to represent the relationship between the parts of the image and the corresponding descriptive sentences as a bipartite graph. Finally, we use a graph convolutional network (GCN), considering each subject as an individual graph, to classify AD patients through a graph-level classification task. In experiments conducted on the ADReSSo Challenge datasets, our model surpassed the existing state-of-the-art performance by achieving an accuracy of 88.73%. Additionally, ablation studies that removed the relationship between images and texts demonstrated the critical role of graphs in improving performance. Furthermore, by utilizing the sentence representations learned through the GCN, we identified the sentences and keywords critical for AD classification.
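The pipeline outlined in the abstract (image-text similarity, then a bipartite graph, then graph-level classification with a GCN) can be illustrated with a minimal sketch. This is not the authors' implementation. The similarity matrix, the 0.5 edge threshold, the node features, the weights, the single-layer architecture with mean pooling, and all function names are assumptions chosen for illustration; in practice the similarities would come from a vision language model and the weights would be learned. The adjacency normalization is the commonly used symmetric form with self-loops.

```python
import numpy as np

def bipartite_adjacency(sim, threshold=0.5):
    """Build a symmetric adjacency matrix for an image-region/sentence graph.

    sim: (n_regions, n_sentences) similarity scores (toy values here).
    Nodes 0..n_regions-1 are image regions; the remaining nodes are sentences.
    The threshold is an illustrative assumption.
    """
    n_img, n_txt = sim.shape
    edges = np.where(sim >= threshold, sim, 0.0)  # keep only strong links
    n = n_img + n_txt
    adj = np.zeros((n, n))
    adj[:n_img, n_img:] = edges       # region -> sentence
    adj[n_img:, :n_img] = edges.T     # sentence -> region
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_graph_logit(adj, features, w1, w2):
    """One GCN layer, mean pooling over all nodes, then a linear readout."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w1, 0.0)  # ReLU activation
    graph_embedding = hidden.mean(axis=0)             # one vector per subject
    return float(graph_embedding @ w2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy example: one subject = one graph with 3 image regions and 4 sentences.
rng = np.random.default_rng(0)
sim = rng.uniform(0.0, 1.0, size=(3, 4))
adj = bipartite_adjacency(sim, threshold=0.5)
features = rng.normal(size=(7, 8))   # random stand-ins for node embeddings
w1 = rng.normal(size=(8, 16))        # untrained weights, illustration only
w2 = rng.normal(size=16)
prob = sigmoid(gcn_graph_logit(adj, features, w1, w2))
print(f"Toy graph-level score: {prob:.3f}")
```

Because the weights are random, the printed score carries no diagnostic meaning; the sketch only shows how one subject's picture description can be turned into a single graph and reduced to a single graph-level prediction.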
KSP Keywords
AD classification, Art performance, Bipartite graph, Classification task, Cognitive function, Convolutional networks, Daily activities, Disease recognition, Image-text, Language Model, Level classification
This work is distributed under the terms of a Creative Commons License (CCL): CC BY-NC-ND