ETRI Knowledge Sharing Platform : Pretrained Network-based Sound Event Recognition for Audio Surveillance Applications

Conference Paper: Pretrained Network-based Sound Event Recognition for Audio Surveillance Applications

Cited 3 times in Scopus

Citation: International Conference on Information and Communication Technology Convergence (ICTC) 2021, pp.1306-1309

Abstract: Despite the recent surge in demand for audio recognition in surveillance systems, many obstacles still prevent its ready use in real environments, such as legal restrictions on public data collection and the difficulty of obtaining large-scale learning data. To overcome these problems, we propose an adaptive sound event recognition scheme based on a network pre-trained on the large-scale AudioSet, in which a PANNs-based CNN and SincNet[3] are employed to extract audio features from the log-mel spectrogram and the waveform, respectively. Our experimental results show that the proposed method achieves a mean average precision (mAP) of 0.415, which is slightly better than the best previous methods. Furthermore, we evaluate the performance of transfer learning using a smaller amount of data that we collected ourselves, considering dangerous-situation scenarios in surveillance applications.
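
The abstract reports its result as mean average precision (mAP). As a reading aid, the following is a minimal sketch of how mAP is commonly computed for multi-label sound event tagging: the per-class average precision is averaged over all classes that have at least one positive clip. The function names and the toy data are hypothetical, and the paper's exact evaluation protocol is not stated in this record, so this should not be read as the authors' implementation.

```python
from typing import List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Ranking-based AP for one class (assumed definition, not taken from the paper).

    Clips are sorted by descending score; AP is the mean of the precision
    values measured at the rank of each positive clip.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    y_true: Sequence[Sequence[int]], y_score: Sequence[Sequence[float]]
) -> float:
    """Mean of per-class AP over classes with at least one positive clip."""
    n_classes = len(y_true[0])
    aps: List[float] = []
    for c in range(n_classes):
        labels = [row[c] for row in y_true]
        if sum(labels) == 0:
            continue  # AP is undefined for a class with no positives; skip it
        scores = [row[c] for row in y_score]
        aps.append(average_precision(labels, scores))
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: 4 clips, 2 event classes (rows = clips, columns = classes).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
print(round(mean_average_precision(y_true, y_score), 4))
```

In this toy case the first class has an AP of 5/6 and the second an AP of 1, so the mean is 11/12. Skipping classes without positives is one common convention; other evaluation toolkits may handle that case differently.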

KSP Keywords: Audio Features, Audio recognition, Audio surveillance, Data Collection, Data collected, Large-Scale Learning, Learning data, Network-based, Public Data, Sound event recognition, Surveillance applications

