Title :
Unsupervised speaker tracking in a speech recognition module for multi-party human-computer dialogue
Author :
Popescu, Vladimir ; Burileanu, Corneliu ; Caelen, Jean
Author_Institution :
Lab. d'Inf. de Grenoble, Grenoble Inst. of Technol., Grenoble, France
Abstract :
Multi-party spoken dialogue systems have yet to be deployed in real applications, because several issues remain to be addressed, e.g. spontaneous speech, reliable voice activity detection (because of user barge-in), and real-time operation. For dialogues between the computer and several users at the same time, speaker tracking is needed to ensure an appropriate analysis of the input speech. This paper addresses precisely this issue: starting from a speaker-independent speech recognizer, we clone this system and adapt it to each new input utterance via unsupervised MLLR (Maximum Likelihood Linear Regression). Then, by taking into account the recognition confidence scores obtained by the speaker-independent and speaker-adapted recognizers for each utterance, we retain a number of adapted systems that model the speakers. Unlike in speaker tracking of "offline" multimedia content, in multi-party dialogue the data are not available in advance and the number and characteristics of the speakers are not known beforehand; moreover, runtime constraints apply, for ergonomic reasons. The proposed speaker tracking procedure is evaluated in the context of a service-oriented book reservation application, in the Romanian language.
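The clone-adapt-compare loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state the decision rule, so the `margin` threshold and the "best adapted model must beat the speaker-independent score" test are hypothetical, and `ToyRecognizer` is a scalar stand-in for a real recognizer, with `clone_and_adapt` replacing genuine unsupervised MLLR (which estimates affine transforms of HMM Gaussian means) by a simple mean shift.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyRecognizer:
    """Hypothetical stand-in for an acoustic model: a single scalar mean."""
    mean: float

    def confidence(self, utterance: List[float]) -> float:
        # Toy confidence score: negative mean squared distance to the model mean.
        return -sum((x - self.mean) ** 2 for x in utterance) / len(utterance)

    def clone_and_adapt(self, utterance: List[float]) -> "ToyRecognizer":
        # Toy replacement for unsupervised MLLR (assumption): the clone's mean
        # is simply moved to the utterance mean.
        return ToyRecognizer(mean=sum(utterance) / len(utterance))


def track_speakers(
    utterances: List[List[float]],
    si_model: ToyRecognizer,
    margin: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> Tuple[List[int], List[ToyRecognizer]]:
    """Assign a speaker index to each utterance, online, one utterance at a time."""
    adapted: List[ToyRecognizer] = []  # retained speaker-adapted systems
    labels: List[int] = []
    for utt in utterances:
        si_score = si_model.confidence(utt)
        scores = [model.confidence(utt) for model in adapted]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] > si_score + margin:
            # An already retained adapted system explains the utterance
            # clearly better than the speaker-independent one: known speaker.
            labels.append(best)
        else:
            # Otherwise clone the speaker-independent system, adapt it to
            # this utterance, and retain it as a new speaker model.
            adapted.append(si_model.clone_and_adapt(utt))
            labels.append(len(adapted) - 1)
    return labels, adapted
```

Called as `track_speakers(utterances, ToyRecognizer(mean=0.0))`, the function processes utterances strictly in arrival order and never needs the number of speakers in advance, which mirrors the online constraint the abstract contrasts with "offline" speaker tracking.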
Keywords :
human computer interaction; maximum likelihood detection; natural language interfaces; regression analysis; speaker recognition; Romanian language; book reservation service-oriented application; maximum likelihood linear regression; multiparty human-computer dialogue; multiparty spoken dialogue systems; offline multimedia content; recognition confidence scores; speaker independent speech recognizer; speaker tracking procedure; speech recognition module; unsupervised MLLR; unsupervised speaker tracking; voice activity detection; Acoustics; Decision trees; Hidden Markov models; Signal processing algorithms; Speech; Speech recognition; Training;
Conference_Title :
16th European Signal Processing Conference, 2008
Conference_Location :
Lausanne