ETRI Knowledge Sharing Platform

Evaluation of LLM-Based VQA Models for Enhancing Battlefield Situation Awareness
Authors
Choulsoo Jang, Yoon-Seok Choi, Kwang-Yong Kim, Jaehwan Kim, Sungwoo Jun, Changeun Lee
Issue Date
2025-02
Citation
International Conference on Big Data and Smart Computing (BigComp) 2025, pp.421-423
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/BigComp64353.2025.00088
Abstract
Multimodal Visual Question Answering (VQA) has emerged as a critical technology for integrating image data and text-based queries to enhance situational awareness in complex environments. By leveraging the capabilities of Large Language Models (LLMs), VQA systems can process external knowledge and provide accurate, context-aware responses. This study explores the development of Korean-supporting multimodal VQA models, focusing on the architectural integration of vision encoders, projection layers, and LLMs. Through pre-training and fine-tuning, the research aims to evaluate these models for tasks requiring language adaptation and external knowledge handling in military scenarios.
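The abstract describes an architecture that couples a vision encoder to an LLM through a projection layer. As an illustrative sketch only (not the authors' implementation), the following PyTorch snippet shows one common way such a projection can be wired: vision-encoder patch embeddings are mapped into the LLM's token embedding space and prepended to the text embeddings. The class name, MLP design, and dimensions (1024 for the vision encoder, 4096 for the LLM) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class VisionLanguageProjector(nn.Module):
    """Hypothetical projector mapping vision-encoder patch embeddings
    into an LLM's token embedding space (dimensions are illustrative)."""

    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # Two-layer MLP projection, a common choice for bridging the modalities.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_embeddings: torch.Tensor) -> torch.Tensor:
        # patch_embeddings: (batch, num_patches, vision_dim)
        return self.proj(patch_embeddings)  # -> (batch, num_patches, llm_dim)


if __name__ == "__main__":
    projector = VisionLanguageProjector()
    image_features = torch.randn(1, 256, 1024)   # placeholder vision-encoder output
    text_embeddings = torch.randn(1, 32, 4096)   # placeholder LLM token embeddings

    # Project visual features, then prepend them to the text token embeddings
    # to form the combined sequence that would be fed to the LLM.
    visual_tokens = projector(image_features)
    llm_input = torch.cat([visual_tokens, text_embeddings], dim=1)
    print(llm_input.shape)  # torch.Size([1, 288, 4096])
```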
KSP Keywords
Architectural integration, Battlefield situation, Context aware, External knowledge, Fine-tuning, Image data, Pre-training, Situation awareness (SA), Situational awareness, Visual Question Answering, Complex environment