Regional Information Center for Science and Technology - Text independent speaker recognition based on the attack state formants and neural network classification

DocumentCode :

1607571

Title :

Text independent speaker recognition based on the attack state formants and neural network classification

Author :

Seddik, Hassen ; Rahmouni, Amel B S ; Sayadi, Mounir

Author_Institution :

ESSTT, Tunis, Tunisia

Volume :

fYear :

2004

Firstpage :

1649

Abstract :

In this paper, a new method for text independent speaker recognition is proposed. Based essentially on the positions of the formant frequencies, the method characterizes the speaker using only the formant positions of the speaker's first voiced speech frame, called the attack state. The fundamental frequency (pitch) is combined with these formants in order to study the effect of this combination on the recognition rate. To validate the approach, two different methods are used to compute the attack state formant positions. The first method locates the formant positions in the power spectral domain using the Yule-Walker equations. The second method uses the frequency response of a numeric (digital) filter corresponding to the vocal tract's transfer function. Both methods are based on a high-order autoregressive (AR) model of the voice. A multi-layer neural network trained by the back-propagation algorithm is proposed for classifying the extracted data. Two classification methods are used: serial classification and a newly proposed method called cascade classification. For each method, different network structures are tested in order to obtain the best results. Good recognition rates are obtained using this attack state approach. In all tests, cascade classification improves the recognition rates.
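As an illustration of the AR-based formant estimation the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes a biased autocorrelation estimate, a Levinson-Durbin solution of the Yule-Walker equations, and simple peak picking on the resulting all-pole power spectrum. The function names, the default model order, and the grid size are hypothetical choices; the abstract does not state the order, frame length, or windowing actually used.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def yule_walker(r):
    """Solve the Yule-Walker equations with the Levinson-Durbin recursion.

    Returns (a, err): a = [1, a1, ..., ap] are the coefficients of
    A(z) = 1 + a1 z^-1 + ... + ap z^-p, err is the prediction error power.
    """
    a = [1.0]
    err = r[0]
    for k in range(1, len(r)):
        acc = sum(a[j] * r[k - j] for j in range(k))
        refl = -acc / err                      # reflection coefficient
        a = a + [0.0]
        a = [a[j] + refl * a[k - j] for j in range(k + 1)]
        err *= 1.0 - refl * refl
    return a, err


def ar_spectrum(a, err, fs, n_points=512):
    """AR power spectrum err / |A(e^{jw})|^2 on a grid from 0 to fs/2."""
    freqs, power = [], []
    for i in range(n_points):
        f = i * (fs / 2.0) / n_points
        w = 2.0 * math.pi * f / fs
        resp = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        freqs.append(f)
        power.append(err / abs(resp) ** 2)
    return freqs, power


def estimate_formants(frame, fs, order=12, count=3):
    """Return up to `count` spectral peak frequencies (Hz), lowest first.

    Assumption: formant candidates are taken as local maxima of the
    AR spectrum; the paper's exact selection rule is not given.
    """
    a, err = yule_walker(autocorrelation(frame, order))
    freqs, power = ar_spectrum(a, err, fs)
    peaks = [freqs[i] for i in range(1, len(power) - 1)
             if power[i - 1] < power[i] > power[i + 1]]
    return peaks[:count]
```

Because the spectrum here is the squared magnitude response of the all-pole filter 1/A(z), this sketch does not distinguish between the two computation methods named in the abstract. In practice, `estimate_formants` would be applied to the first voiced frame of an utterance, with the sampling rate `fs` and the model order chosen to suit the recordings.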

Keywords :

autoregressive processes; backpropagation; frequency response; neural nets; speaker recognition; Yule-Walker equations; attack state; back-propagation algorithm; cascade classification; data classification; formants position; frequency response; high autoregressive model; multilayer neural network; neural network classification; numeric filter; power spectral domain; text independent speaker recognition; transfer function; Data mining; Equations; Filters; Frequency response; Multi-layer neural network; Neural networks; Speaker recognition; Speech; Testing; Transfer functions;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Industrial Technology, 2004. IEEE ICIT '04. 2004 IEEE International Conference on

Print_ISBN :

0-7803-8662-0

Type :

conf

DOI :

10.1109/ICIT.2004.1490815

Filename :

1490815

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1607571