ETRI Knowledge Sharing Platform

Details

Journal Article
Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients
Cited 9 times in Scopus; downloaded 102 times
Authors
엄영식, 방준성
Publication Date
September 2021
Source
Journal of Information and Communication Convergence Engineering, v.19 no.3, pp.148-154
ISSN
2234-8255
Publisher
The Korea Institute of Information and Communication Engineering (한국정보통신학회)
DOI
https://dx.doi.org/10.6109/jicce.2021.19.3.148
Project
21IR1800, Development of a Conversational Public Safety Knowledge Service (PolBot), 방준성
Abstract
With the advent of context-aware computing, many attempts have been made to understand emotions. Among these attempts, Speech Emotion Recognition (SER) is a method of recognizing a speaker's emotions from speech information. SER succeeds when distinctive features are selected and classified in an appropriate way. In this paper, the performance of SER using neural network models (e.g., a fully connected network (FCN) and a convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% over five emotions (anger, happiness, calm, fear, and sadness) of men and women. In addition, the distribution of emotion recognition accuracies across neural network models shows that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.
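
To illustrate the pipeline described in the abstract, the sketch below extracts fixed-size MFCC feature maps with librosa and trains a small 2D-CNN classifier with Keras on the five emotion classes. The number of MFCCs, the frame length, the layer sizes, and the training settings are illustrative assumptions for this sketch, not the exact configuration reported in the paper.

import numpy as np
import librosa
from tensorflow.keras import layers, models

# The five emotion classes examined in the paper.
EMOTIONS = ["anger", "happiness", "calm", "fear", "sadness"]

def mfcc_image(wav_path, sr=22050, n_mfcc=40, max_frames=174):
    """Load one clip and return a fixed-size (n_mfcc, max_frames, 1) MFCC map."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Pad or truncate along time so every clip yields the same input shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :max_frames]
    return mfcc[..., np.newaxis]

def build_2d_cnn(input_shape=(40, 174, 1), n_classes=len(EMOTIONS)):
    """Small 2D-CNN classifier over the MFCC time-frequency plane (sizes assumed)."""
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage: wav_paths and integer labels (indices into EMOTIONS) are assumed to be
# prepared from the RAVDESS file names elsewhere.
# X = np.stack([mfcc_image(p) for p in wav_paths])
# y = np.array(labels)
# model = build_2d_cnn()
# model.fit(X, y, epochs=50, validation_split=0.2)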
KSP Keywords
Audio-visual, Best performance, Convolution neural network(CNN), Frequency cepstral coefficients, Mel-Frequency Cepstrum Coefficients(MFCC), Mel-frequency cepstral, Model parameter, Overall accuracy, Speech Emotion recognition, Speech information, context-Aware computing
This work may be used under the terms of the Creative Commons Attribution-NonCommercial (CC BY-NC) license.