Stream-based speaker segmentation using speaker factors and eigenvoices

Author

Castaldo, Fabio ; Colibro, Daniele ; Dalmasso, Emanuele ; Laface, Pietro ; Vair, Claudio

Author_Institution

Politecnico di Torino, Turin

fYear

2008

fDate

March 31 2008-April 4 2008

Firstpage

4133

Lastpage

4136

Abstract

This paper presents a stream-based approach to unsupervised segmentation of multi-speaker conversational speech. The main idea is to exploit prior knowledge about the speaker space to find a low-dimensional vector of speaker factors that summarizes the salient speaker characteristics. The new approach yields lower segmentation error rates than the state-of-the-art results reported in our previous work on the segmentation task of the NIST 2000 Speaker Recognition Evaluation (SRE). We also show how automatic segmentation affects the performance of a speaker recognition system in the core test of the 2006 NIST SRE, by comparing the results obtained on single-speaker test data with those obtained on automatically segmented test data.
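
As an illustration of the speaker-factor idea mentioned in the abstract, the following is a minimal sketch rather than the authors' method. It assumes the common eigenvoice formulation in which a speaker-dependent supervector is approximated as a mean plus a low-rank combination of eigenvoices, and it swaps in a plain least-squares fit for whatever estimator the paper actually uses. The function name `estimate_speaker_factors`, the dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def estimate_speaker_factors(supervector, mean, eigenvoices):
    """Least-squares estimate of y in: supervector ~ mean + eigenvoices @ y.

    Assumption: a plain least-squares fit stands in for the estimator
    used in the paper, which is not specified in this record.
    """
    y, *_ = np.linalg.lstsq(eigenvoices, supervector - mean, rcond=None)
    return y


# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
dim, n_factors = 200, 5                           # hypothetical sizes
mean = rng.normal(size=dim)                       # speaker-independent mean
eigenvoices = rng.normal(size=(dim, n_factors))   # columns span the speaker space
true_y = rng.normal(size=n_factors)               # hidden speaker factors
supervector = mean + eigenvoices @ true_y         # noiseless observation

estimated_y = estimate_speaker_factors(supervector, mean, eigenvoices)
```

In this sketch the 200-dimensional observation is summarized by a 5-dimensional vector, which is the kind of compact speaker description that a segmentation system could compare across successive portions of the audio stream.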

Keywords

eigenvalues and eigenfunctions; speech processing; speech recognition; conversational speech segmentation; eigenvoices; multispeaker speech segmentation; segmentation error rates; speaker factors; speaker recognition system; stream-based speaker segmentation; unsupervised speech segmentation; Automatic testing; Delay; Error analysis; NIST; Performance analysis; Signal analysis; Speaker recognition; Speech; Streaming media; System testing; Speaker modeling; speaker clustering; speaker segmentation

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location

Las Vegas, NV

ISSN

1520-6149

Print_ISBN

978-1-4244-1483-3

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2008.4518564

Filename

4518564