ETRI-Knowledge Sharing Platform





Conference paper: Automatic Detection of Malicious Sound Using Segmental Two-Dimensional Mel-Frequency Cepstral Coefficients and Histograms of Oriented Gradients
Cited 4 times in Scopus
김명종, 김영관, 임재덕, 김회린
International Conference on Multimedia (MM) 2010, pp.887-890
10MS2100, Development of Harmful Multimedia Content Analysis/Blocking Technology, 정병호
This paper addresses the problem of recognizing malicious sounds, such as sexual screams or moans, in order to detect and block objectionable multimedia content. Malicious sounds have distinctive characteristics: large temporal variations and fast spectral transitions. Extracting features that properly represent these characteristics is therefore important for achieving better performance. In this paper, we employ segment-based two-dimensional Mel-frequency cepstral coefficients and histograms of gradient directions as a feature set to characterize both the temporal variations and the spectral transitions within a long-range segment of the target signal. A Gaussian mixture model (GMM) is adopted to statistically represent the malicious and non-malicious sounds, and test sounds are classified by the maximum a posteriori (MAP) probability method. Evaluation of the proposed feature extraction method on a database of several hundred malicious and non-malicious sound clips yielded a precision of 91.31% and a recall of 94.27%. This result suggests that the approach could be used as an alternative to image-based methods. © 2010 ACM.
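The classification stage named in the abstract (one GMM per class, decision by MAP) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy two-dimensional features, the component weights, means and variances, the equal class priors, the diagonal covariances, and the function names (`gmm_log_likelihood`, `map_classify`) are all hypothetical. It also assumes that the segment-level feature vectors of a clip are scored independently and that their log-likelihoods are summed, which the abstract does not state.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def gmm_log_likelihood(x, gmm):
    """Log p(x | class) for a GMM given as a list of (weight, mean, var)."""
    return log_sum_exp(
        [math.log(w) + log_gauss_diag(x, mean, var) for w, mean, var in gmm]
    )


def map_classify(segments, models, priors):
    """Pick the class maximizing log P(c) + sum over segments of log p(x | c).

    Summing per-segment log-likelihoods is an assumption of this sketch
    (segments treated as independent given the class).
    """
    scores = {
        label: math.log(priors[label])
        + sum(gmm_log_likelihood(x, gmm) for x in segments)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models; real models would be trained on labeled clips.
MODELS = {
    "malicious": [
        (0.5, [2.0, 2.0], [1.0, 1.0]),
        (0.5, [3.0, 1.0], [1.0, 1.0]),
    ],
    "non_malicious": [
        (0.6, [-2.0, -1.0], [1.0, 1.0]),
        (0.4, [-1.0, -3.0], [1.0, 1.0]),
    ],
}
PRIORS = {"malicious": 0.5, "non_malicious": 0.5}  # assumed equal priors

label, scores = map_classify([[2.1, 1.8], [2.9, 1.2]], MODELS, PRIORS)
print(label, scores)
```

With equal priors the MAP rule reduces to a maximum-likelihood decision; unequal priors would shift the decision threshold toward the more frequent class.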
KSP Suggested Keywords
Automatic Detection, Feature set, Frequency cepstral coefficients, Gaussian mixture Model(GMM), Gradient direction, Histograms of Oriented Gradients(HOG), Image-based method, Long-range, Maximum a Posterior(MAP), Mel-Frequency Cepstrum Coefficients(MFCC), Mel-frequency cepstral