ETRI Knowledge Sharing Platform

Evaluation of LLM-Based VQA Models for Enhancing Battlefield Situation Awareness
Authors
Choulsoo Jang, Yoon-Seok Choi, Kwang-Yong Kim, Jaehwan Kim, Sungwoo Jun, Changeun Lee
Issue Date
2025-02
Citation
International Conference on Big Data and Smart Computing (BigComp) 2025, pp.421-423
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/BigComp64353.2025.00088
Abstract
Multimodal Visual Question Answering (VQA) has emerged as a critical technology for integrating image data and text-based queries to enhance situational awareness in complex environments. By leveraging the capabilities of Large Language Models (LLMs), VQA systems can process external knowledge and provide accurate, context-aware responses. This study explores the development of Korean-supporting multimodal VQA models, focusing on the architectural integration of vision encoders, projection layers, and LLMs. Through pre-training and fine-tuning, the research aims to evaluate these models for tasks requiring language adaptation and external knowledge handling in military scenarios.
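The abstract describes an architecture that couples a vision encoder to an LLM through a projection layer. As an illustrative sketch only (not the authors' implementation), the following PyTorch snippet shows one common way such a projection can be wired: vision-encoder patch embeddings are mapped into the LLM's token embedding space and prepended to the text embeddings. The class name, MLP design, and dimensions (1024 for the vision encoder, 4096 for the LLM) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class VisionLanguageProjector(nn.Module):
    """Hypothetical projector mapping vision-encoder patch embeddings
    into an LLM's token embedding space (dimensions are illustrative)."""

    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # Two-layer MLP projection, a common choice for bridging the modalities.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_embeddings: torch.Tensor) -> torch.Tensor:
        # patch_embeddings: (batch, num_patches, vision_dim)
        return self.proj(patch_embeddings)  # -> (batch, num_patches, llm_dim)


if __name__ == "__main__":
    projector = VisionLanguageProjector()
    image_features = torch.randn(1, 256, 1024)   # placeholder vision-encoder output
    text_embeddings = torch.randn(1, 32, 4096)   # placeholder LLM token embeddings

    # Project visual features, then prepend them to the text token embeddings
    # to form the combined sequence that would be fed to the LLM.
    visual_tokens = projector(image_features)
    llm_input = torch.cat([visual_tokens, text_embeddings], dim=1)
    print(llm_input.shape)  # torch.Size([1, 288, 4096])
```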
KSP Keywords
Architectural integration, Battlefield situation, Context aware, External knowledge, Fine-tuning, Image data, Pre-training, Situation awareness (SA), Situational awareness, Visual Question Answering, Complex environment