ETRI-Knowledge Sharing Platform

Exploring zero-shot essay scoring: from feature-based to LLM-based approaches
Authors
Hongseok Choi, Myeong-Cheol Kang, Jin Seong, Jin-Xia Huang
Issue Date
2026-04
Citation
Data Mining and Knowledge Discovery, v.40, no.3, pp.1-32
ISSN
1384-5810
Publisher
Springer Nature
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.1007/s10618-026-01193-z
Abstract
Automated Essay Scoring (AES) plays a crucial role in educational assessment, yet building reliable systems often requires large amounts of prompt- and trait-specific labeled data. As a result, many studies rely heavily on a single public dataset, ASAP++, which limits their generalizability. In this paper, we explore diverse zero-shot AES approaches that are both prompt-agnostic and multi-trait, thereby reducing reliance on labeled data. Our study examines large language models (LLMs), BERT-like fine-tuned models, and simple feature-based methods. Furthermore, we propose two novel zero-shot frameworks: CAnUSe (Comparative Assessment and subsequent Uncertainty-aware Self-training) and EASY (Essay Assessment with Simple Yardsticks). CAnUSe combines LLM-based pairwise comparisons with uncertainty-aware self-training using a BERT-like single-essay scoring model, while EASY employs only three intuitive features: essay length, vocabulary diversity, and grammar quality. Evaluation on three benchmark datasets (ASAP++, Ellipse, and TOEFL11) shows that our frameworks achieve state-of-the-art performance, demonstrating strong generalizability across datasets. Remarkably, despite operating in a zero-shot setting, our models approach the performance of fully supervised models trained with approximately 10K labeled samples. Further analysis provides practical insights for building real-world AES systems.
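The abstract names the three features used by EASY but does not say how they are measured or combined. As a rough illustration only, and not the authors' implementation, a label-free score built from such features might look like the sketch below. Every function name here is hypothetical, vocabulary diversity is approximated by a type-token ratio, grammar quality is approximated from a caller-supplied error count (for example, from an external grammar checker), and the equal-weight average over min-max normalized features is an assumption.

```python
import re
from statistics import mean


def tokenize(text):
    """Lowercased word tokens; a deliberately simple, assumed tokenizer."""
    return re.findall(r"[A-Za-z']+", text.lower())


def essay_features(text, grammar_errors):
    """Return (length, diversity, grammar) for one essay.

    grammar_errors is a hypothetical input: an error count obtained
    elsewhere, since the abstract does not say how grammar is measured.
    """
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        return (0, 0.0, 0.0)
    diversity = len(set(tokens)) / n              # type-token ratio
    grammar = 1.0 - min(1.0, grammar_errors / n)  # fewer errors per token is better
    return (n, diversity, grammar)


def min_max(values):
    """Scale values to [0, 1]; constant columns map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def zero_shot_scores(essays, grammar_error_counts):
    """Equal-weight average of the normalized features (an assumption)."""
    feats = [essay_features(t, g) for t, g in zip(essays, grammar_error_counts)]
    columns = [min_max(list(col)) for col in zip(*feats)]
    return [mean(row) for row in zip(*columns)]


if __name__ == "__main__":
    essays = [
        "The cat sat.",
        "The quick brown fox jumps over the lazy dog and then rests quietly.",
    ]
    print(zero_shot_scores(essays, [1, 0]))
```

The scores are relative within the batch being scored, so they rank essays against one another without any labeled data; mapping them onto a specific rubric scale would need a further step that this sketch leaves out.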
Keyword
Automated essay scoring, Feature-based methods, Large language models, Low-resource settings, Uncertainty-aware self-training, Zero-shot learning
KSP Keywords
State-of-the-art performance, Automated Essay Scoring (AES), Benchmark datasets, Comparative assessment, Educational assessment, Feature-based methods, Labeled samples, Low-resource settings, Public datasets, Real-world, Reliable system
This work is distributed under the terms of a Creative Commons License (CCL)
(CC BY ND)