ETRI Knowledge Sharing Platform: Multilingual, Not Multicultural: Uncovering the Cultural Empathy Gap in LLMs through a Comparative Empathetic Dialogue Benchmark




Authors: Woojin Lee, Yujin Sim, Hongjin Kim, Harksoo Kim

Issue Date: 2025-12

Citation: International Joint Conference on Natural Language Processing and Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL) 2025, pp.791-809

Publisher: The Asian Federation of Natural Language Processing and The Association for Computational Linguistics

Language: English

Type: Conference Paper

DOI: https://dx.doi.org/10.18653/v1/2025.ijcnlp-long.44

Abstract: Large Language Models (LLMs) demonstrate remarkable multilingual capabilities, yet it remains unclear whether they are truly multicultural. Do they merely process different languages, or can they genuinely comprehend the unique cultural contexts embedded within them? This study investigates this critical question by examining whether LLMs' perception of emotion and empathy differs across linguistic and cultural boundaries. To facilitate this, we introduce Korean Empathetic Dialogues (KoED), a benchmark extending the English-based EmpatheticDialogues (ED) dataset. Moving beyond direct translation, we meticulously reconstructed dialogues specifically selected for their potential for cultural adaptation, aligning them with Korean emotional nuances and incorporating key cultural concepts like ‘jeong’ and ‘han’ that lack direct English equivalents. Our cross-cultural evaluation of leading multilingual LLMs reveals a significant “cultural empathy gap”: models consistently underperform on KoED compared to ED, struggling especially with uniquely Korean emotional expressions. Notably, the Korean-centric model, EXAONE, exhibits significantly higher cultural appropriateness. This result provides compelling evidence that aligns with the “data provenance effect”, suggesting that the cultural alignment of pre-training data is a critical factor for genuine empathetic communication. These findings demonstrate that current LLMs have cultural blind spots and underscore the necessity of benchmarks like KoED to move beyond simple linguistic fluency towards truly culturally adaptive AI systems.

KSP Keywords: Adaptive AI, Blind spot, Critical factors, Cross-cultural evaluation, Emotional expression, Pre-Training, data provenance, language models, training data

This work is distributed under the terms of a Creative Commons License (CC BY).
