ETRI-Knowledge Sharing Platform


Named entity recognition using transfer learning and small human‐ and meta‐pseudo‐labeled datasets
Cited 3 times in Scopus. Downloaded 91 times.
Authors
Kyoungman Bae, Joon-Ho Lim
Issue Date
2024-02
Citation
ETRI Journal, v.46, no.1, pp.59-70
ISSN
1225-6463
Publisher
Electronics and Telecommunications Research Institute (ETRI)
Language
English
Type
Journal Article
DOI
https://dx.doi.org/10.4218/etrij.2023-0321
Abstract
We introduce a high‐performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta‐pseudo‐label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human‐ and pseudo‐labeled data. Second, the influence of noisy pseudo‐labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human‐labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.
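The three mechanisms named in the abstract (the averaged student loss, the threshold-gated teacher update, and the rollback to the best model) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The equal-weight mean, the reading that the teacher is updated only when the feedback score itself is below 0.0005, the rule that rollback triggers whenever the score on human-labeled data fails to improve, and all function and class names are assumptions made for this sketch.

```python
import copy

# Threshold value quoted in the abstract; how it is applied is assumed here.
TEACHER_UPDATE_THRESHOLD = 0.0005


def student_loss(human_loss: float, pseudo_loss: float) -> float:
    """Average the losses from a human-labeled and a pseudo-labeled batch.

    Assumption: a plain equal-weight mean of the two losses.
    """
    return 0.5 * (human_loss + pseudo_loss)


def should_update_teacher(feedback_score: float,
                          threshold: float = TEACHER_UPDATE_THRESHOLD) -> bool:
    """Gate the teacher update on the feedback score.

    Assumption: the teacher is updated only when the feedback score
    is strictly below the threshold.
    """
    return feedback_score < threshold


class BestModelKeeper:
    """Hypothetical helper that rolls back to the best parameters seen so far.

    Assumption: 'best' means the highest score on held-out human-labeled data.
    """

    def __init__(self):
        self.best_score = float("-inf")
        self.best_params = None

    def step(self, params, score):
        """Return the parameters to continue training from."""
        if score > self.best_score:
            # New best: remember a copy and keep training from it.
            self.best_score = score
            self.best_params = copy.deepcopy(params)
            return params
        # No improvement: revert to a copy of the best parameters.
        return copy.deepcopy(self.best_params)
```

In a training loop, `student_loss` would feed the student update, `should_update_teacher` would decide whether the teacher step runs for that batch, and `BestModelKeeper.step` would be called after each evaluation on the human-labeled data.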
KSP Keywords
BEST Model, Data scarcity, Named entity Recognition, Student model, Teacher Model, Transfer learning, label vector, model-based, spoken language, unlabeled data
This work is distributed under the terms of the Korea Open Government License (KOGL)
(Type 4: Type 1 + Commercial Use Prohibition + Change Prohibition)