ETRI Knowledge Sharing Platform : Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding

Conference Paper: Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding

Citation: Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2019: Student Research Workshop (SRW), pp.97-102

Abstract: One of the main challenges in Spoken Language Understanding (SLU) is dealing with 'open-vocabulary' slots. Neural-network-based SLU models have recently been proposed, but recognizing slots that contain unknown words, or 'open-vocabulary' slots, remains difficult because manually tagged SLU datasets are costly to create. This paper proposes data noising, which reflects the characteristics of 'open-vocabulary' slots, as a data augmentation method. We applied it to an attention-based bi-directional recurrent neural network (Liu and Lane, 2016) and experimented with three datasets: Airline Travel Information System (ATIS), Snips, and MIT-Restaurant. We achieved performance improvements of up to 0.57% in intent prediction (accuracy) and up to 3.25 in slot filling (F1-score). Our method is advantageous because it requires no additional memory and can be applied during model training.
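The abstract does not spell out the noising scheme, so the following is only a minimal sketch of one plausible reading: at training time, tokens inside designated open-vocabulary slots are randomly replaced with other words while the slot tags stay fixed. The function name `noise_open_vocab_slots`, the BIO tag format, the replacement probability `p`, and the replacement vocabulary are all assumptions made for illustration and are not taken from the paper.

```python
import random


def noise_open_vocab_slots(tokens, tags, open_slots, vocab, p=0.5, rng=None):
    """Return a noised copy of `tokens` and an unchanged copy of `tags`.

    Hypothetical helper, not the authors' implementation.
    tokens     : list of words in one utterance
    tags       : BIO slot tags aligned with tokens, e.g. "B-restaurant_name"
    open_slots : set of slot names treated as open-vocabulary (assumption)
    vocab      : list of candidate replacement words (assumption)
    p          : probability of replacing each open-slot token (assumption)
    """
    rng = rng or random
    noised = list(tokens)
    for i, tag in enumerate(tags):
        # Strip the "B-" / "I-" prefix to get the slot name; "O" has none.
        slot = tag[2:] if tag[:2] in ("B-", "I-") else None
        if slot in open_slots and rng.random() < p:
            # Replace the word but keep its tag, so the model must rely on
            # context rather than on having seen the slot value before.
            noised[i] = rng.choice(vocab)
    return noised, list(tags)
```

Calling this helper on each utterance as a mini-batch is built, for example `noise_open_vocab_slots(tokens, tags, {"restaurant_name"}, vocab)`, produces a fresh noised variant every epoch without storing an enlarged dataset, which is in line with the abstract's remark that the method needs no additional memory and runs alongside training.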

KSP Keywords: Data Augmentation, F1-score, Spoken language understanding, Travel information, Unknown words, bi-directional, information system, intent prediction, neural network(NN), recurrent neural network(RNN), slot filling
