ETRI-Knowledge Sharing Platform

Details

Conference paper: How to Aggregate Acoustic Delta Features for Deep Speaker Embeddings
Authors
김영삼, 노종혁, 조관태, 조상래
Publication date
2020-10
Source
International Conference on Information and Communication Technology Convergence (ICTC) 2020, pp.1225-1229
DOI
https://dx.doi.org/10.1109/ICTC49870.2020.9289205
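Funded project
20HR3100, Development of Portal Device Security Technology Connecting Human (H), Infrastructure (I), and Service (S) in High-Trust Intelligent Information Services, 조상래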
Abstract
Speaker verification based on deep speaker embedding (DSE) networks has outperformed traditional i-vector systems. Since then, various studies have sought to improve performance further, and data augmentation is one such approach. In this paper, we focus on augmentation with acoustic delta features and on methods for aggregating them in two DSE networks, X-vectors and MobileVoxNet. For the CNN-based MobileVoxNet, we redesign the architecture to aggregate delta features in a deeper layer with a squeeze-and-excitation (SE) module. Experimental results on the VoxCeleb1 test dataset show that the proposed methods improve performance over not using delta features. We also compare the number of computations and parameters of the models to analyze the efficiency of the proposed methods.
KSP Suggested Keywords
Data Augmentation, Delta features, Re-design, Speaker Embeddings, Speaker verification, aggregation method, performance improvement