Multi-feature stacking order impact on speech emotion recognition performance
Yoga Tanoko, Amalia Zahra
Abstract
One of the biggest challenges in implementing speech emotion recognition (SER) is producing a model that is both accurate and lightweight. One approach is to use a one-dimensional convolutional neural network (1D CNN) on a combination of handcrafted features. 1D CNNs are mostly used for time series data, where the order of information plays an important role; likewise, the order in which the features are stacked plays an important role. This work analyzes the impact of changing that order. It brute-forces all possible orderings of five features: Mel-frequency cepstral coefficients (MFCC), Mel-spectrogram, chromagram, spectral contrast, and tonnetz. It uses a 1D CNN as the model architecture and benchmarks the model's performance on the Ryerson audio-visual database of emotional speech and song (RAVDESS) dataset. The results show that changing the order of features can affect overall classification accuracy, per-emotion accuracy, and model size. The best model achieves an accuracy of 79.17% in classifying 8 emotion classes with the following order: spectral contrast, tonnetz, chromagram, Mel-spectrogram, and MFCC. Finding a suitable order can increase accuracy by up to 16.05% and reduce model size by up to 96%.
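The brute-force search over feature orders described above can be illustrated with a minimal sketch. It assumes that each feature has already been reduced to a one-dimensional vector and that "stacking" means concatenating those vectors end to end; the vector lengths, the random placeholder data, and the helper names `stack_in_order` and `all_orderings` are hypothetical and are not taken from the paper. Feature extraction and the 1D CNN itself are omitted.

```python
from itertools import permutations

import numpy as np

# Hypothetical per-feature vector lengths; the paper's actual sizes are not
# stated in the abstract.
FEATURE_DIMS = {
    "mfcc": 40,
    "mel_spectrogram": 128,
    "chromagram": 12,
    "spectral_contrast": 7,
    "tonnetz": 6,
}


def stack_in_order(features, order):
    """Concatenate 1D feature vectors in the given order into one 1D input."""
    return np.concatenate([features[name] for name in order])


def all_orderings(names):
    """Return every ordering (permutation) of the feature names."""
    return list(permutations(names))


# Random placeholder vectors stand in for features extracted from one clip.
rng = np.random.default_rng(0)
features = {name: rng.standard_normal(dim) for name, dim in FEATURE_DIMS.items()}

# One stacked input per ordering; each would be fed to its own 1D CNN run.
orderings = all_orderings(FEATURE_DIMS)
stacked = {order: stack_in_order(features, order) for order in orderings}
```

With five features, this enumeration yields 5! = 120 orderings, each of which would require a separate training and evaluation run on RAVDESS to compare accuracy and model size.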
Keywords
Chromagram; CNN; Mel-spectrogram; MFCC; Spectral contrast; Speech emotion recognition; Tonnetz
DOI:
https://doi.org/10.11591/eei.v11i6.4287
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Bulletin of Electrical Engineering and Informatics (BEEI), ISSN: 2089-3191, e-ISSN: 2302-9285. This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).