ETRI-Knowledge Sharing Platform



Conference Paper: An Approach on a Combination of Higher-order Statistics and Higher-order Differential Energy Operator for Detecting Pathological Voice with Machine Learning
Cited 9 times in Scopus
문지혜, 김상훈
International Conference on Information and Communication Technology Convergence (ICTC) 2018, pp.46-51
18ZS1100, Research on Core Source Technologies for Autonomously Growing AI, 이윤근
The voice signal is an indicator of the progression of diseases such as nerve disorders and muscle dysfunction. To improve the performance of medical diagnosis systems that use the voice signal, this paper proposes a new feature extraction method that combines higher-order statistics (HOS) and the higher-order differential energy operator (DEO). For the experiments, the Saarbruecken Voice Database (SVD) was used; 687 healthy voice samples and 263 pathological voice samples, covering cysts, paralysis, and polyps, were selected. In addition, the openSMILE script, which provides 6,373 features, was used for comparison with the new features. To identify the most effective features, Gradient Boosting was used as a feature selector. Finally, 20 features, including 15 combinations of HOS and DEO, were chosen, and a deep neural network (DNN) was trained on the new features. The best accuracy of 87.4% was obtained, exceeding the best accuracy of 84.5% obtained with the existing features. The finding suggests that pathological voice can be detected efficiently using only statistical information, without heavy computation such as convolutional neural networks. Because of its simple structure, we expect this approach to be easily applied to a variety of mobile systems.
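As a rough illustration of what combining HOS with a higher-order DEO could look like, the sketch below applies the discrete k-th order differential energy operator, in the form commonly attributed to Maragos and Potamianos, to a signal and then summarizes each output with skewness and kurtosis. The abstract does not specify which orders, statistics, or framing the paper uses, so the function names, the choice of orders 2 to 4, and the use of standardized third and fourth moments are all assumptions made here for illustration, not the authors' method.

```python
import numpy as np

def deo(x, k=2):
    """Discrete k-th order differential energy operator (assumed form):
    Y_k[n] = x[n] * x[n + k - 2] - x[n - 1] * x[n + k - 1].
    For k = 2 this reduces to the Teager-Kaiser energy operator."""
    x = np.asarray(x, dtype=float)
    if k < 2:
        raise ValueError("order k must be at least 2")
    # Valid indices keep both n - 1 and n + k - 1 inside the signal.
    n = np.arange(1, len(x) - (k - 1))
    return x[n] * x[n + k - 2] - x[n - 1] * x[n + k - 1]

def standardized_moment(v, order):
    """Standardized central moment: order 3 is skewness, order 4 is kurtosis."""
    v = np.asarray(v, dtype=float)
    centered = v - v.mean()
    sigma = centered.std()
    if sigma < 1e-12:
        return 0.0  # degenerate (constant) input; avoid division by zero
    return float(np.mean(centered ** order) / sigma ** order)

def hos_deo_features(x, deo_orders=(2, 3, 4), moments=(3, 4)):
    """Hypothetical HOS-of-DEO feature set: one value per (DEO order, moment)."""
    features = {}
    for k in deo_orders:
        energy = deo(x, k)
        for m in moments:
            features[f"deo{k}_moment{m}"] = standardized_moment(energy, m)
    return features

if __name__ == "__main__":
    # Synthetic stand-in for a voice frame; real use would load SVD recordings.
    rng = np.random.default_rng(0)
    t = np.arange(2000)
    frame = np.sin(0.05 * t) + 0.1 * rng.standard_normal(t.size)
    for name, value in hos_deo_features(frame).items():
        print(name, round(value, 4))
```

Features of this kind are cheap to compute (a few multiplications and averages per sample), which is consistent with the abstract's point that statistical features avoid the cost of convolutional models. In a pipeline like the one described, the resulting values would be ranked by a feature selector such as Gradient Boosting before training a classifier.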
KSP Suggested Keywords
Convolutional neural network (CNN), Deep neural network (DNN), Diagnosis system, Energy operator, Medical diagnosis, Mobile system, Pathological voice, Statistical information, Voice signal, Feature extraction method, Gradient boosting