ETRI-Knowledge Sharing Platform

How to Aggregate Acoustic Delta Features for Deep Speaker Embeddings
Cited 1 time in Scopus
Authors
Youngsam Kim, Jong-hyuk Roh, Kwantae Cho, Sangrae Cho
Issue Date
2020-10
Citation
International Conference on Information and Communication Technology Convergence (ICTC) 2020, pp.1225-1229
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICTC49870.2020.9289205
Abstract
Speaker verification based on deep speaker embedding (DSE) networks has outperformed traditional i-vector systems. Since then, various studies have been conducted to improve performance further, and data augmentation methods are among them. In this paper, we focus on augmentation with acoustic delta features and on methods for aggregating them in DSE networks, namely X-vectors and MobileVoxNet. For the CNN-based MobileVoxNet, we redesign the architecture to aggregate delta features in a deeper layer using a squeeze-and-excitation (SE) module. Experimental results on the VoxCeleb1 test dataset show that the proposed methods improve performance compared with not using delta features. We also compare the number of computations and parameters of the models to analyze the efficiency of the proposed methods.
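For readers unfamiliar with the terms in the abstract, the following is a minimal NumPy sketch of what delta features and a squeeze-and-excitation gate look like in general. It is not the authors' implementation. The abstract does not specify the delta window, the feature dimensions, or where exactly the aggregation happens, so the regression window `width=2`, the channel-wise stacking, the omission of bias terms, and the function names `delta`, `stack_static_and_deltas`, and `se_gate` are all assumptions made for illustration.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Standard regression-based delta over time.

    features: array of shape (frames, coeffs).
    width: number of frames on each side used in the regression (assumed value).
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    num_frames = features.shape[0]
    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_static_and_deltas(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Stack static, delta, and delta-delta features as 3 input channels.

    This input-level stacking is only one possible aggregation (an assumption);
    the paper studies where and how to aggregate, including deeper layers.
    Returns an array of shape (3, frames, coeffs).
    """
    d1 = delta(features, width)
    d2 = delta(d1, width)
    return np.stack([features.astype(float), d1, d2], axis=0)


def se_gate(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-excitation channel reweighting (bias terms omitted).

    x:  (channels, frames, coeffs)
    w1: (hidden, channels)  hypothetical bottleneck weights
    w2: (channels, hidden)  hypothetical expansion weights
    """
    squeezed = x.mean(axis=(1, 2))                 # squeeze: global average per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)        # excitation: ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1)
    return x * scale[:, None, None]                # rescale each channel


# Toy usage with random data standing in for 40-dim filterbank frames.
rng = np.random.default_rng(0)
toy_features = rng.standard_normal((100, 40))
channels = stack_static_and_deltas(toy_features)
gated = se_gate(channels, rng.standard_normal((2, 3)), rng.standard_normal((3, 2)))
```

In this sketch the SE gate learns one scalar weight per channel, which gives a network a cheap way to emphasize or suppress the static, delta, or delta-delta stream. In a trained model `w1` and `w2` would be learned parameters rather than random values.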
KSP Keywords
Data Augmentation, Delta features, Re-design, Speaker Embeddings, Speaker verification, aggregation method, performance improvement