ETRI Knowledge Sharing Platform: Domain-Slot Relationship Modeling Using a Pre-Trained Language Encoder for Multi-Domain Dialogue State Tracking

Journal Article: Domain-Slot Relationship Modeling Using a Pre-Trained Language Encoder for Multi-Domain Dialogue State Tracking

Cited 2 times in Scopus

Authors: Jinwon An, Sungzoon Cho, Junseong Bang, Misuk Kim

Issue Date: 2022-06

Citation: IEEE/ACM Transactions on Audio, Speech, and Language Processing, v.30, pp.2091-2102

ISSN: 2329-9290

Publisher: ACM, IEEE

Language: English

Type: Journal Article

DOI: https://dx.doi.org/10.1109/TASLP.2022.3181350

Abstract: Dialogue state tracking for multi-domain dialogues is challenging because the model must track dialogue states across multiple domains and slots. As using pre-trained language models is the de facto standard for natural language processing tasks, many recent studies use them to encode the dialogue context for predicting the dialogue states. Model architectures that have certain inductive biases for modeling the relationship among different domain-slot pairs are also emerging. Our work builds on both of these research approaches to multi-domain dialogue state tracking. We propose a model architecture that effectively models the relationship among domain-slot pairs using a pre-trained language encoder. Inspired by the way the special [CLS] token in BERT is used to aggregate the information of the whole sequence, we use multiple special tokens for each domain-slot pair that encode information corresponding to its domain and slot. The special tokens are run together with the dialogue context through the pre-trained language encoder, which effectively models the relationship among different domain-slot pairs. Our experimental results on the MultiWOZ-2.0 and MultiWOZ-2.1 datasets show that our model outperforms other models in the same setting. Our ablation studies have three main parts. The first shows the effectiveness of our approach to exploiting relationship modeling. The second compares the effect of using different pre-trained language encoders. The third compares different initialization methods that could be used for the special tokens. Qualitative analysis of the attention map of the pre-trained language encoder shows that our special tokens encode relevant information through the encoding process by attending to each other.
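
Illustration (not from the paper): the input construction described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions. It uses one special token per domain-slot pair for simplicity, and the token naming (`[DS:hotel-area]`), the `build_input` helper, and the three example domain-slot pairs are all hypothetical; the abstract does not give the authors' exact token format or ontology. The sketch only shows how special tokens could be placed alongside the dialogue context so that a pre-trained encoder would process them in a single sequence. It does not implement the encoder or the state prediction.

```python
from typing import Dict, List, Tuple

# Hypothetical domain-slot pairs, chosen for illustration only.
DOMAIN_SLOTS: List[Tuple[str, str]] = [
    ("hotel", "area"),
    ("hotel", "price"),
    ("train", "day"),
]


def build_input(
    dialogue_tokens: List[str],
    domain_slots: List[Tuple[str, str]],
) -> Tuple[List[str], Dict[Tuple[str, str], int]]:
    """Place one special token per domain-slot pair before the dialogue context.

    Returns the full token sequence and the index of each special token, so
    that an encoder's output at that index could be read as a summary for
    the corresponding domain-slot pair (analogous to how [CLS] is used).
    """
    tokens: List[str] = ["[CLS]"]
    positions: Dict[Tuple[str, str], int] = {}
    for domain, slot in domain_slots:
        positions[(domain, slot)] = len(tokens)
        tokens.append(f"[DS:{domain}-{slot}]")  # assumed token naming
    tokens.append("[SEP]")
    tokens.extend(dialogue_tokens)
    tokens.append("[SEP]")
    return tokens, positions


dialogue = "i need a cheap hotel in the north".split()
tokens, positions = build_input(dialogue, DOMAIN_SLOTS)
```

Because the special tokens and the dialogue context share one sequence, self-attention in the encoder lets every domain-slot token attend to the dialogue and to the other domain-slot tokens, which is the relationship modeling the abstract refers to.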

KSP Keywords: De facto standard, Encoding process, Initialization methods, Language Model, Model architecture, Multi-Domain, Multiple domains, Natural Language Processing (NLP), Qualitative Analysis, Relationship Modeling, State tracking
