Regional Information Center for Science and Technology - Using prosodic and lexical information for speaker identification

DocumentCode :

2851831

Title :

Using prosodic and lexical information for speaker identification

Author :

Weber, Frederick ; Manganaro, Linda ; Peskin, Barbara ; Shriberg, Elizabeth

Author_Institution :

Dragon Systems/Lernout and Hauspie, Newton, MA, USA

Volume :

fYear :

2002

fDate :

13-17 May 2002

Abstract :

We investigate the incorporation of larger time-scale information, such as prosody, into standard speaker ID systems. Our study is based on the Extended Data Task of the NIST 2001 Speaker ID evaluation, which provides much more test and training data than has traditionally been available to similar speaker ID investigations. In addition, we have had access to a detailed prosodic feature database of Switchboard-I conversations, including data not previously applied to speaker ID. We describe two baseline acoustic systems: an approach using Gaussian Mixture Models and an LVCSR-based speaker ID system. These results are compared to and combined with two larger time-scale systems: a system based on an “idiolect” language model, and a system making use of the contents of the prosody database. We find that, with sufficient test and training data, suprasegmental information can significantly enhance the performance of traditional speaker ID systems.

Keywords :

Accuracy; Artificial intelligence; Computational modeling; Data models; NIST; Robustness; Switches

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location :

Orlando, FL, USA

ISSN :

1520-6149

Print_ISBN :

0-7803-7402-9

Type :

conf

DOI :

10.1109/ICASSP.2002.5743674

Filename :

5743674

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2851831