Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations

Author

Salamin, H.; Polychroniou, Anna; Vinciarelli, Alessandro

Author_Institution

Sch. of Comput. Sci., Univ. of Glasgow, Glasgow, UK

fYear

2013

fDate

13-16 Oct. 2013

Firstpage

4282

Lastpage

4287

Abstract

This article presents experiments on the automatic detection of laughter and fillers, two of the most important nonverbal behavioral cues observed in spoken conversations. The proposed approach is fully automatic and segments audio recordings captured with mobile phones into four types of interval: laughter, filler, speech and silence. The segmentation methods rely not only on probabilistic sequential models (in particular Hidden Markov Models), but also on Statistical Language Models that estimate the a priori probability of observing a given sequence of these four classes. The experiments are speaker-independent and were performed over 8 hours and 25 minutes of data from 120 people. The results show that F₁ scores of up to 0.64 for laughter and 0.58 for fillers can be achieved.
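
As a rough illustration of how sequence-level prior knowledge can be combined with acoustic evidence when segmenting audio into the four classes, the sketch below runs Viterbi decoding over per-frame class log-likelihoods with a bigram prior over class transitions. It is a simplified sketch under stated assumptions, not the authors' implementation: the function names, the frame-level bigram, the `lm_weight` parameter and all numeric values are hypothetical, and the per-frame log-likelihoods are assumed to come from acoustic models trained elsewhere.

```python
import math

CLASSES = ["laughter", "filler", "speech", "silence"]


def viterbi_segment(frame_loglik, bigram_logprob, lm_weight=1.0):
    """Return the most likely class label for each frame.

    frame_loglik: list of dicts, one per frame, mapping class -> log-likelihood
                  (assumed to be produced by per-class acoustic models).
    bigram_logprob: dict mapping (previous_class, class) -> log-probability
                    (a hypothetical bigram prior over the four classes).
    lm_weight: scaling factor for the bigram term (an assumption of this sketch).
    """
    if not frame_loglik:
        return []
    # Best score of any path ending in each class at the first frame.
    score = {c: frame_loglik[0][c] for c in CLASSES}
    backpointers = []
    for frame in frame_loglik[1:]:
        new_score, pointers = {}, {}
        for c in CLASSES:
            # Choose the predecessor maximising path score plus transition prior.
            prev = max(
                CLASSES,
                key=lambda p: score[p] + lm_weight * bigram_logprob[(p, c)],
            )
            new_score[c] = (
                score[prev] + lm_weight * bigram_logprob[(prev, c)] + frame[c]
            )
            pointers[c] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best-scoring final class.
    label = max(CLASSES, key=lambda c: score[c])
    path = [label]
    for pointers in reversed(backpointers):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


def to_intervals(labels):
    """Merge runs of identical frame labels into (class, start, end) tuples.

    Frame indices are used as time units and `end` is exclusive.
    """
    intervals = []
    for i, label in enumerate(labels):
        if intervals and intervals[-1][0] == label:
            intervals[-1] = (label, intervals[-1][1], i + 1)
        else:
            intervals.append((label, i, i + 1))
    return intervals


def make_sticky_bigram(stay=0.9):
    """Toy bigram prior that favours staying in the same class (illustrative values)."""
    switch = (1.0 - stay) / (len(CLASSES) - 1)
    return {
        (p, c): math.log(stay if p == c else switch)
        for p in CLASSES
        for c in CLASSES
    }
```

With a prior that favours staying in the same class, a single frame that only weakly prefers "filler" in the middle of a speech run is absorbed into the speech interval, whereas frames with strong acoustic evidence for a different class still produce a new interval.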

Keywords

audio signal processing; hidden Markov models; mobile handsets; speech recognition; F₁ scores; audio recording segmentation; fillers detection; laughter detection; nonverbal behavioral cues; probabilistic sequential models; silence; speech; spoken conversations; spontaneous mobile phone conversations; statistical language models; Accuracy; Feature extraction; Mathematical model; Mel frequency cepstral coefficient; Training; Nonverbal Vocal Behavior

fLanguage

English

Publisher

IEEE

Conference_Title

2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Conference_Location

Manchester

Type

conf

DOI

10.1109/SMC.2013.730

Filename

6722483