ETRI Knowledge Sharing Platform : Convolutional Recurrent Neural Networks for Urban Sound Classification using Raw Waveforms


Conference Paper: Convolutional Recurrent Neural Networks for Urban Sound Classification using Raw Waveforms

Cited 68 times in Scopus

Citation: European Signal Processing Conference (EUSIPCO) 2018, pp.2444-2448

Abstract: Recent studies have demonstrated that deep learning approaches operating directly on raw data have been used successfully for images and text. This approach has also been applied to audio signals, but it has not yet been fully explored. In this work, we propose a convolutional recurrent neural network that directly uses time-domain waveforms as input for urban sound classification. The convolutional recurrent neural network is a combined model: convolutional neural networks extract sound features, and recurrent neural networks aggregate the extracted features over time. The method was evaluated using the UrbanSound8K dataset, the largest public dataset of urban environmental sound sources available for research. The results show that the convolutional recurrent neural network with raw waveforms improves accuracy in urban sound classification and demonstrate the effectiveness of its structure with respect to the number of parameters.
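The two-stage structure described in the abstract (convolution over the raw waveform for feature extraction, then a recurrent pass for temporal aggregation) can be illustrated with a minimal forward-pass sketch. This is not the authors' architecture: the filter count, kernel width, stride, hidden size, the plain tanh recurrence, and the random untrained weights are all assumptions chosen to keep the example small, and the names `conv1d`, `rnn_last_state`, `classify`, and `init_params` are hypothetical. The only figure taken from outside the sketch is that UrbanSound8K has 10 sound classes.

```python
import math
import random

NUM_CLASSES = 10  # UrbanSound8K has 10 sound classes


def conv1d(signal, kernels, stride):
    """Strided 1-D convolution followed by ReLU.

    Returns a list of frames; each frame holds one activation per kernel.
    """
    width = len(kernels[0])
    frames = []
    for start in range(0, len(signal) - width + 1, stride):
        window = signal[start:start + width]
        frames.append([
            max(0.0, sum(w * x for w, x in zip(kernel, window)))
            for kernel in kernels
        ])
    return frames


def rnn_last_state(frames, w_in, w_rec):
    """Plain tanh recurrence over the frames (an assumption; the abstract
    does not say which recurrent unit is used). Returns the final state."""
    hidden = [0.0] * len(w_rec)
    for frame in frames:
        hidden = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[j], frame))
                + sum(w * h for w, h in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
    return hidden


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(seed=0, n_filters=4, kernel_width=8, stride=4, hidden_size=6):
    """Random, untrained weights; every size here is a hypothetical choice."""
    rng = random.Random(seed)
    return {
        "kernels": random_matrix(rng, n_filters, kernel_width),
        "stride": stride,
        "w_in": random_matrix(rng, hidden_size, n_filters),
        "w_rec": random_matrix(rng, hidden_size, hidden_size),
        "w_out": random_matrix(rng, NUM_CLASSES, hidden_size),
    }


def classify(waveform, params):
    """Raw waveform -> conv features -> recurrent summary -> class probabilities."""
    frames = conv1d(waveform, params["kernels"], params["stride"])
    hidden = rnn_last_state(frames, params["w_in"], params["w_rec"])
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in params["w_out"]]
    return softmax(logits)


# Synthetic 440 Hz tone sampled at 8 kHz, standing in for a real audio clip
waveform = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
probs = classify(waveform, init_params())
print(len(probs), round(sum(probs), 6))
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how time-domain samples flow through the convolutional stage into the recurrent stage and out as one probability per class.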

KSP Keywords: Audio signal, Convolution neural network (CNN), Environmental sound, Learning approach, Public datasets, Raw data, Sound features, Sound source, Temporal aggregation, Urban sound classification, Combined model

