Document type:
Conference paper with proceedings
Title:
Dynamic Speech Emotion Recognition with State-Space Models
Author(s):
Markov, Konstantin [Author]
Matsui, Tomoko [Author]
Septier, Francois [Author]
Institut TELECOM/TELECOM Lille1
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Peters, Gareth W. [Author]
University College of London [London] [UCL]
Conference title:
23rd European Signal Processing Conference (EUSIPCO)
City:
Nice
Country:
France
Conference start date:
2015-08-31
Publication date:
2015-08-31
HAL discipline(s):
Engineering sciences [physics]/Signal and image processing [eess.SP]
Abstract in English: [en]
Automatic emotion recognition from speech has focused mainly on identifying categorical or static affect states, but the spectrum of human emotion is continuous and time-varying. In this paper, we present a recognition system for dynamic speech emotion based on state-space models (SSMs). The prediction of the unknown emotion trajectory in the affect space spanned by Arousal, Valence, and Dominance (A-V-D) descriptors is cast as a time series filtering task. The state-space models we investigated include a standard linear model (Kalman filter) as well as a novel non-linear, non-parametric Gaussian process (GP) based SSM. We use the AVEC 2014 database for evaluation; it provides ground truth A-V-D labels, which allow the state and measurement functions to be learned separately, simplifying model training. For filtering with the GP SSM, we used two approximation methods: a recently proposed analytic method and a particle filter. All models were evaluated in terms of average Pearson correlation R and root mean square error (RMSE). The results show that, using the same feature vectors, the GP SSMs achieve twice the correlation and half the RMSE of a Kalman filter.
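The linear baseline named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear-Gaussian model x_t = A x_{t-1} + w_t, y_t = C x_t + v_t, fits the state and measurement functions separately by least squares from labelled trajectories (as ground truth labels make possible), runs a Kalman filter, and scores each dimension with Pearson R and RMSE. The function names (`fit_linear_ssm`, `kalman_filter`, `pearson_r`, `rmse`), the dimensions, and the synthetic data standing in for AVEC 2014 features are all assumptions made for illustration; the GP-based SSM is not shown.

```python
import numpy as np

def fit_linear_ssm(X, Y):
    """Fit A, Q (state model) and C, R (measurement model) separately by least squares.
    X: (T, d) labelled states, Y: (T, m) feature vectors."""
    A = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T   # X[:-1] @ A.T ~ X[1:]
    C = np.linalg.lstsq(X, Y, rcond=None)[0].T            # X @ C.T ~ Y
    Q = np.cov((X[1:] - X[:-1] @ A.T).T)                  # state noise covariance
    R = np.cov((Y - X @ C.T).T)                           # measurement noise covariance
    return A, Q, C, R

def kalman_filter(Y, A, Q, C, R, x0, P0):
    """Return the filtered state means, one row per observation in Y."""
    x, P, out = x0, P0, []
    for y in Y:
        # Predict step
        x = A @ x
        P = A @ P @ A.T + Q
        # Update step: K = P C^T S^{-1}, computed via a linear solve
        S = C @ P @ C.T + R
        K = np.linalg.solve(S, C @ P).T
        x = x + K @ (y - C @ x)
        P = (np.eye(len(x)) - K @ C) @ P
        out.append(x)
    return np.array(out)

def pearson_r(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-in data (assumption): 3 state dimensions, 6 features.
rng = np.random.default_rng(0)
d, m, T = 3, 6, 400
A_true = 0.95 * np.eye(d)
C_true = rng.normal(size=(m, d))

def simulate(T):
    X = np.zeros((T, d))
    for t in range(1, T):
        X[t] = A_true @ X[t - 1] + 0.1 * rng.normal(size=d)
    Y = X @ C_true.T + 0.1 * rng.normal(size=(T, m))
    return X, Y

X_train, Y_train = simulate(T)
X_test, Y_test = simulate(T)

A, Q, C, R = fit_linear_ssm(X_train, Y_train)
est = kalman_filter(Y_test, A, Q, C, R, np.zeros(d), np.eye(d))

# Per-dimension scores; the A-V-D names are only labels for the synthetic axes.
scores = [(pearson_r(est[:, k], X_test[:, k]), rmse(est[:, k], X_test[:, k]))
          for k in range(d)]
for name, (r, e) in zip(("Arousal", "Valence", "Dominance"), scores):
    print(f"{name}: R={r:.3f}, RMSE={e:.3f}")
```

Fitting the two models separately is only possible because the states are observed during training; without labels, the parameters would have to be estimated jointly, for example with an EM procedure.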
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Collections:
Source: