ETRI Knowledge Sharing Platform

Sound Event Localization and Detection using Spatial Feature Fusion
Cited 2 times in Scopus
Authors
Su-Hwa Jo, Chi Yoon Jeong, Mooseop Kim
Issue Date
2022-10
Citation
International Conference on Information and Communication Technology Convergence (ICTC) 2022, pp.1849-1851
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICTC55196.2022.9952784
Abstract
Sound event localization and detection (SELD) identifies the category and location of a sound event, providing valuable information for many applications. Existing methods primarily use convolutional recurrent neural networks as the network model and Log-Mel spectrograms to classify sound events. However, there is no single dominant spatial feature for identifying the direction of sound events, and fusing several spatial features can lead to better performance on the SELD task. In this study, we propose an optimal feature fusion by systematically analyzing various combinations of spatial features. We used the TAU-NIGENS Spatial Sound Events 2021 dataset to evaluate the SELD performance of these combinations. We found that the combination of interaural phase difference (IPD) and sinIPD outperformed the other features and combinations. Finally, we confirmed that the proposed features outperform state-of-the-art methods.
KSP Keywords
Feature fusion, Network model, Optimal feature, Phase difference, Spatial sound, Event localization, Neural network (NN), Recurrent neural network (RNN), Sound events, Spatial feature, State-of-the-art
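The abstract's key spatial features, IPD and sinIPD, are computed from the phase difference between microphone channels in the time-frequency domain. The sketch below is a minimal NumPy illustration of that extraction, not code from the paper; the function names, FFT size, and hop length are our own assumptions.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Frame the signal, apply a Hann window, and take the rFFT per frame."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)          # shape: (n_frames, n_fft // 2 + 1)

def ipd_features(ch_ref, ch_other, n_fft=512, hop=256):
    """IPD and sinIPD between a reference channel and a second channel."""
    X1 = stft(ch_ref, n_fft, hop)
    X2 = stft(ch_other, n_fft, hop)
    ipd = np.angle(X1 * np.conj(X2))            # phase difference, wrapped to [-pi, pi]
    return ipd, np.sin(ipd)

# Toy example: a 440 Hz tone and a copy delayed by 4 samples,
# mimicking the inter-microphone delay that IPD captures.
fs = 16000
t = np.arange(fs) / fs
ref = np.sin(2 * np.pi * 440 * t)
delayed = np.roll(ref, 4)
ipd, sin_ipd = ipd_features(ref, delayed)
```

In a fused feature set along the lines the abstract describes, these IPD and sinIPD maps would be stacked as extra channels alongside the Log-Mel spectrogram before being fed to the network.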