A real-time text-independent speaker identification system

Author

Cordella, L.P. ; Foggia, P. ; Sansone, C. ; Vento, M.

Author_Institution

Dipt. di Informatica e Sistemistica, Univ. Federico II, Napoli, Italy

fYear

2003

fDate

17-19 Sept. 2003

Firstpage

632

Lastpage

637

Abstract

The paper presents a real-time speaker identification system based on the analysis of the audio track of a video stream. The system has been employed in the context of automatic video segmentation. It uses features evaluated in both the time and frequency domains. Their combined use significantly improves the performance of the system. Experiments have been carried out on a database extracted from over one hour of television news, including 10 speakers. The obtained results confirm the effectiveness of the approach, showing an error rate of less than 1% when the time interval used for identifying a speaker is about 1.5 seconds.

Keywords

audio signal processing; feature extraction; image segmentation; speaker recognition; time-frequency analysis; video signal processing; automatic video segmentation; feature extraction; frequency domain; real-time speaker identification; television news; text-independent speaker identification; time domain; video stream audio track; Cepstral analysis; Content based retrieval; Error analysis; Frequency; Image segmentation; Indexing; Real time systems; Spatial databases; Streaming media; TV;

fLanguage

English

Publisher

IEEE

Conference_Titel

Image Analysis and Processing, 2003. Proceedings. 12th International Conference on

Print_ISBN

0-7695-1948-2

Type

conf

DOI

10.1109/ICIAP.2003.1234121

Filename

1234121