ETRI Knowledge Sharing Platform

Representation Learning for Background Music Identification in Television Shows
Authors
Hyemi Kim, Junghyun Kim, Jihyun Park, Wonyoung Yoo
Issue Date
2019-10
Citation
International Conference on Information and Communication Technology Convergence (ICTC) 2019, pp.1434-1437
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICTC46691.2019.8939934
Abstract
Although audio fingerprinting has been widely used in various applications, the performance of audio fingerprinting methods degrades severely when identifying background music mixed with speech in TV shows. To address this, we present an approach that learns embeddings for background music identification using deep convolutional networks. We construct a triplet dataset consisting of original songs, the same songs mixed with voices, and different songs, and then train the network with a triplet loss function with an adaptive margin. A nearest-neighbor classifier finds the closest embedding among those of the original songs. Comparing top-1 accuracy of music identification, we show that the learned embedding of each music segment mixed with speech carries meaningful information for music identification.
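The pipeline the abstract describes, triplet-loss training followed by nearest-neighbor lookup over embeddings of the original songs, can be sketched as follows. This is a minimal illustration with toy vectors standing in for CNN embeddings: the paper's adaptive margin is replaced by a fixed one, and the names `sq_dist`, `triplet_loss`, and `identify` are hypothetical, not from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: pull the anchor (song mixed with
    speech) toward the positive (the same original song) and push it
    away from the negative (a different song). The paper uses an
    adaptive margin; a fixed margin is shown here for simplicity."""
    return max(sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin, 0.0)

def identify(query, catalog):
    """Nearest-neighbor classification: return the index of the
    catalog embedding (original song) closest to the query embedding
    (music segment mixed with speech)."""
    return min(range(len(catalog)), key=lambda i: sq_dist(query, catalog[i]))

# Toy usage: a query whose embedding lies near catalog entry 1.
catalog = [[0.0, 1.0], [1.0, 0.05]]
query = [1.0, 0.0]
print(identify(query, catalog))  # → 1
```

In a real system the embeddings would come from the trained convolutional network, and the catalog lookup would typically use an approximate nearest-neighbor index rather than a linear scan.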