ETRI-Knowledge Sharing Platform

Details

Conference paper: Discrimination of Speech Activity and Impact Noise Using an Accelerometer and a Microphone in a Car Environment
Authors
김선만, 김홍국, 이성주, 이윤근
Publication date
2011-12
Source
International Conference on Future Generation Communication and Networking (FGCN) 2011 (CCIS 266), v.266, pp.104-113
DOI
https://dx.doi.org/10.1007/978-3-642-27201-1_13
Project
11MS2600, Development of Natural-Language Speech Interface Technology Applying a Dialogue Model on Mobile Platforms, 이윤근
Abstract
In this paper, we propose an algorithm to discriminate speech from vehicle body impact noise in a car. Depending on road conditions such as large bumps or unpaved stretches, impact noise from the car body may interfere with the detection of voice commands for an in-car speech-enabled service, degrading service performance. The proposed algorithm classifies each analysis frame of the input signal recorded by a microphone into one of four categories: speech, impact noise, background noise, or mixed speech and impact noise. The classification is based on the likelihood ratio test (LRT) using statistical models constructed by combining signals obtained from the microphone with those from an accelerometer. In other words, the differing characteristics captured by acoustic and mechanical sensing enable better discrimination of voice commands from noise emanating from the vehicle body. The performance of the proposed algorithm is evaluated using a corpus of speech recordings made in a car moving at an average speed of 30-50 km/h, with impact noise at signal-to-noise ratios (SNRs) from -3 to 1 dB, where the SNR is defined as the ratio of the power of the speech signal to that of the impact noise. The experiments show that the proposed algorithm achieves a discrimination accuracy of 85%. © 2011 Springer-Verlag.
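The frame-wise classification described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features or statistical models used, so everything below is an assumption made for illustration: each frame is reduced to a pair of log-energies (microphone, accelerometer), each of the four categories is modeled by a diagonal Gaussian with made-up parameters, and a frame is labeled by the largest log-likelihood ratio against the background-noise model. The names `MODELS`, `classify_frame`, and `snr_db` are hypothetical, and `snr_db` simply expresses the stated SNR definition (speech power over impact-noise power) in decibels.

```python
import math

# Hypothetical diagonal-Gaussian models over the feature pair
# (microphone log-energy, accelerometer log-energy).
# Means and variances are illustrative values, not taken from the paper.
MODELS = {
    "background": ((-6.0, -6.0), (1.0, 1.0)),
    "speech": ((-2.0, -6.0), (1.0, 1.0)),
    "impact": ((-3.0, -1.0), (1.0, 1.0)),
    "speech+impact": ((-1.0, -1.0), (1.0, 1.0)),
}


def log_energy(frame):
    """Natural log of the mean power of one analysis frame."""
    power = sum(x * x for x in frame) / len(frame)
    return math.log(power + 1e-12)  # small floor avoids log(0)


def log_likelihood(feature, mean, var):
    """Log-density of a diagonal Gaussian evaluated at `feature`."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (f - m) ** 2 / v)
        for f, m, v in zip(feature, mean, var)
    )


def classify_frame(mic_frame, accel_frame, threshold=0.0):
    """Label a frame by its largest log-likelihood ratio versus background.

    If no ratio exceeds `threshold`, the frame is treated as background noise.
    """
    feature = (log_energy(mic_frame), log_energy(accel_frame))
    ll_background = log_likelihood(feature, *MODELS["background"])
    best_label, best_llr = "background", threshold
    for label, (mean, var) in MODELS.items():
        if label == "background":
            continue
        llr = log_likelihood(feature, mean, var) - ll_background
        if llr > best_llr:
            best_label, best_llr = label, llr
    return best_label


def snr_db(speech_power, impact_power):
    """SNR in dB, with impact noise as the 'noise' term, as in the abstract."""
    return 10.0 * math.log10(speech_power / impact_power)
```

In this sketch the accelerometer channel is what separates speech from impact noise: a loud microphone frame with a quiet accelerometer frame scores highest under the speech model, while high energy on both channels favors the impact or mixed models. A real system would estimate the model parameters from labeled recordings rather than fix them by hand.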
KSP suggested keywords
Background noise, Car body, Degraded service, Impact noise, Input signal, Road condition, Service performance, Signal-to-Noise, Speech Signals, Speech recordings, Statistical Model