Title :
Social signal classification using deep BLSTM recurrent neural networks
Author :
Brueckner, Raymond ; Schuller, Björn
Author_Institution :
Machine Intell. & Signal Process. Group, Tech. Univ. München, München, Germany
Abstract :
Non-verbal speech cues play an important role in human communication, for example in expressing emotional states or maintaining the conversational flow. In this paper we investigate the effect of applying deep bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks to the Interspeech 2013 Computational Paralinguistics Social Signals Sub-Challenge dataset, which requires frame-wise, speaker-independent detection and classification of laughter and filler vocalizations in speech. BLSTM networks tend to prevail over conventional neural network architectures whenever the recognition or regression task relies on an intelligent exploitation of temporal context information. We introduce deep BLSTM models by stacking several BLSTMs and by combining non-recurrent deep neural networks with BLSTMs. We demonstrate that this new approach achieves significant improvements over previous attempts, and we increase the current state-of-the-art unweighted average area-under-the-curve (UAAUC) value from 92.4% to 94.0%. This is the best result on this task reported in the literature so far.
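The UAAUC figure quoted in the abstract can be illustrated with a minimal sketch. It assumes that UAAUC is the plain mean of two one-vs-rest frame-level AUC values, one for laughter and one for filler, and it computes each AUC with the Mann-Whitney pairwise statistic. The function names, the `garbage` label for all remaining frames, and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def uaauc(
    class_scores: Dict[str, Sequence[float]],
    frame_labels: Sequence[str],
    classes: Sequence[str] = ("laughter", "filler"),
) -> float:
    """Unweighted mean of one-vs-rest AUCs over the given classes (assumed definition)."""
    aucs = []
    for c in classes:
        # One-vs-rest: frames of class c are positives, everything else negatives.
        binary = [1 if y == c else 0 for y in frame_labels]
        aucs.append(binary_auc(class_scores[c], binary))
    return sum(aucs) / len(aucs)


# Toy example: four frames with hypothetical per-class posterior scores.
frame_labels = ["laughter", "garbage", "filler", "garbage"]
class_scores = {
    "laughter": [0.3, 0.2, 0.1, 0.4],
    "filler": [0.1, 0.3, 0.8, 0.2],
}
result = uaauc(class_scores, frame_labels)
```

Because the two per-class AUCs are averaged without weighting by class frequency, a rare class such as laughter contributes as much to the score as a frequent one. The pairwise loop is quadratic in the number of frames; a rank-based computation would be preferable on a full dataset.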
Keywords :
recurrent neural nets; signal classification; speech processing; bidirectional long short term memory; deep BLSTM recurrent neural networks; social signal classification; temporal context information; Context; Context modeling; Hidden Markov models; Recurrent neural networks; Speech; Training; Long Short-Term Memory; deep BLSTM; paralinguistics
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854518