ETRI-Knowledge Sharing Platform


Development of Recognition System Using Fusion of Natural Gesture/Speech
Cited 3 times in Scopus
Authors
Young-Giu Jung, Mun-Sung Han, Jun Seok Park, Sang Jo Lee
Issue Date
2008-01
Citation
International Conference on Consumer Electronics (ICCE) 2008, pp.1-2
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICCE.2008.4588016
Project Code
07MH1900, Development of Wearable Personal Station, Han Dong Won
Abstract
A multimodal interface can achieve more natural and effective human-computer interaction. In this paper, we present an isolated-word recognizer using a fusion of speech and natural visual gestures. The fusion of audio and visual signals can be carried out either at the class level or the feature level. Our system performs fusion at the feature level and supports 10 natural gestures. One of the most difficult problems in feature-level fusion is synchronization between audio and visual features. To solve this problem, we propose a modified Time Delay Neural Network (TDNN) architecture with a dedicated fusion layer and optimize the parameters of this recognition model. Experimental results show that this system yields a performance improvement over Automatic Speech Recognition (ASR) alone under various Signal-to-Noise Ratio (SNR) conditions. ©2008 IEEE.
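The feature-level fusion described in the abstract can be illustrated with a minimal sketch: per-modality time-delay (1-D temporal convolution) layers feed frame-synchronous features into a dedicated fusion layer, followed by an isolated-word decision. All dimensions, weights, and layer counts below are hypothetical — the record does not publish the paper's exact topology or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def time_delay_layer(x, w, b, delay):
    """One TDNN 'time-delay' unit: a 1-D convolution over frames.
    x: (frames, in_dim); w: (delay, in_dim, out_dim); b: (out_dim,)."""
    frames = x.shape[0] - delay + 1
    out = np.empty((frames, w.shape[2]))
    for t in range(frames):
        window = x[t:t + delay]                       # (delay, in_dim)
        out[t] = np.tanh(np.einsum("di,dio->o", window, w) + b)
    return out

# Hypothetical dimensions: 40 frames, MFCC-like audio features,
# gesture-trajectory visual features, 10 word/gesture classes.
T, AUDIO_DIM, VISUAL_DIM, HIDDEN, N_CLASSES = 40, 13, 8, 16, 10

audio  = rng.normal(size=(T, AUDIO_DIM))     # stand-in audio frames
visual = rng.normal(size=(T, VISUAL_DIM))    # stand-in visual frames

# Separate time-delay layers per modality...
wa = rng.normal(scale=0.1, size=(3, AUDIO_DIM, HIDDEN)); ba = np.zeros(HIDDEN)
wv = rng.normal(scale=0.1, size=(3, VISUAL_DIM, HIDDEN)); bv = np.zeros(HIDDEN)
ha = time_delay_layer(audio,  wa, ba, delay=3)
hv = time_delay_layer(visual, wv, bv, delay=3)

# ...then a dedicated fusion layer that sees both streams at once
# (feature-level fusion via frame-synchronous concatenation).
fused = np.concatenate([ha, hv], axis=1)
wf = rng.normal(scale=0.1, size=(5, 2 * HIDDEN, N_CLASSES))
bf = np.zeros(N_CLASSES)
hf = time_delay_layer(fused, wf, bf, delay=5)

# Isolated-word decision: average class scores over time, take the argmax.
scores = hf.mean(axis=0)
print("predicted class:", int(np.argmax(scores)))
```

Concatenating hidden activations frame by frame is what makes the fusion "feature level": the fusion layer's time-delay window can learn small audio/visual misalignments directly, which is the synchronization problem the abstract highlights.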
KSP Keywords
Fusion layer, Multimodal interface, Recognition System, Recognition model, Signal-to-Noise, Time delay neural network, Visual features, Visual signals, automatic speech recognition(ASR), class level, feature level fusion