ETRI-Knowledge Sharing Platform

MASP: Multi-Aspect Guided Emotion Reasoning with Soft Prompt Tuning in Vision-Language Models
Authors
SangEun Lee, Yubeen Lee, Eunil Park, Wonseok Chae
Issue Date
2026-01
Citation
The Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI) 2026, pp.1-9
Publisher
Association for the Advancement of Artificial Intelligence
Language
English
Type
Conference Paper
Abstract
Understanding human emotions from images is a challenging yet essential task for vision-language models. While recent efforts have fine-tuned vision-language models to enhance emotional awareness, most approaches rely on global visual representations and fail to capture the nuanced and multifaceted nature of emotional cues. Furthermore, most existing approaches adopt instruction tuning, which requires costly dataset construction and involves training a large number of parameters, thereby limiting their scalability and efficiency. To address these challenges, we propose MASP, a novel framework for Multi-Aspect guided emotion reasoning with Soft Prompt tuning in vision-language models. MASP explicitly separates emotion-relevant visual cues via multi-aspect cross-attention modules and guides the language model using soft prompts, enabling efficient and scalable task adaptation without modifying the base model. Our method achieves state-of-the-art performance on various emotion recognition benchmarks, demonstrating that the explicit modeling of multi-aspect emotional cues with soft prompt tuning leads to more accurate and interpretable emotion reasoning in vision-language models.
KSP Keywords
Art performance, Dataset construction, Emotion recognition, Emotional awareness, Explicit modeling, Human emotions, Multi-aspect, Visual cues, Language models, State-of-the-art, Task adaptation