ETRI Knowledge Sharing Platform : Non-Speech Section Detection on Media Contents


Conference Paper Non-Speech Section Detection on Media Contents


Citation: International Workshop on Advanced Image Technology (IWAIT) 2017, pp.1-2

Abstract: This paper addresses non-speech section detection for DVS (Descriptive Video Service) authoring. The goal is to identify the non-speech sections of media content, which contains a variety of sounds, where an audio description can be inserted. The proposed method uses a Deep Neural Network (DNN) trained on audio features extracted from the center channel signal of a full-mix stereo audio track. By jointly exploiting the inter-channel structure of broadcast audio and the characteristics of speech signals, it achieves a lower error rate and faster convergence than the conventional method. Experiments on real broadcast audio confirm the high performance of the proposed method.
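
The abstract does not say how the center channel is obtained or which audio features are used, so the following is only an illustrative sketch under stated assumptions. It approximates the center channel as the mid signal (L + R) / 2, a common simplification that keeps content shared by both channels (typically dialogue in broadcast mixes) and cancels content that is in opposite phase. It then computes a per-frame log energy as a stand-in feature. The function names, frame length, and hop size are hypothetical, and the DNN classifier itself is omitted.

```python
import math


def center_channel(left, right):
    """Approximate the center channel as the mid signal (L + R) / 2.

    Assumption: this is a simple stand-in, not the extraction method
    used in the paper, which the abstract does not describe.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    return [0.5 * (l + r) for l, r in zip(left, right)]


def frame_log_energy(signal, frame_len=400, hop=160, eps=1e-10):
    """Return the log mean energy of each full frame.

    A trailing partial frame is dropped. frame_len and hop are
    hypothetical values (25 ms and 10 ms at a 16 kHz sampling rate).
    """
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + eps))
    return feats


# Synthetic stereo mix: a "speech-like" tone identical in both channels
# plus an "ambience" tone panned in opposite phase.
n = 1000
speech = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(n)]
ambience = [0.3 * math.sin(2 * math.pi * 50 * t / 16000) for t in range(n)]
left = [s + a for s, a in zip(speech, ambience)]
right = [s - a for s, a in zip(speech, ambience)]

center = center_channel(left, right)   # ambience cancels, speech remains
features = frame_log_energy(center)    # one value per frame
```

In a full pipeline, per-frame features like these would be fed to a trained classifier that labels each frame as speech or non-speech, and runs of non-speech frames would mark candidate sections for an audio description.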

