Regional Information Center for Science and Technology (RICeST) - Speaker identification model for Assamese language using a neural framework

DocumentCode :

671658

Title :

Speaker identification model for Assamese language using a neural framework

Author :

Sarma, M. ; Sarma, Kandarpa Kumar

Author_Institution :

Dept. of Electron. & Electr. Eng., IIT Guwahati, Guwahati, India

fYear :

2013

fDate :

4-9 Aug. 2013

Firstpage :

Lastpage :

Abstract :

This paper presents a neural model of speaker identification using the vowel sounds segmented out from words spoken by a speaker. Vowel sounds occur in speech more frequently and with higher energy. Therefore, in situations where the acoustic information is corrupted by noise, vowel sounds can be used to extract different amounts of speaker discriminative information. The model explained here uses a neural framework formed with a Probabilistic Neural Network (PNN) and Learning Vector Quantization (LVQ), where a novel Self Organizing Map (SOM) based vowel segmentation technique is used. The work extracts glottal source information of the speakers by Empirical-Mode Decomposition (EMD) of the speech signal, from which an LVQ based speaker code book is formed. The work shows the use of the residual signal obtained from EMD of speech as a speaker discriminative feature. The neural approach to speaker identification gives superior performance in comparison to conventional statistical approaches found in the literature, such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs). The work formulates a framework for the design of an ANN based speaker recognition model for the Assamese language, which is spoken by around three million people in the North East Indian state of Assam. Although the proposed model has been tested on speakers of the Assamese language, it should also be suitable for other Devanagari based languages, provided the speaker database contains samples of that specific language.
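
The abstract mentions an LVQ based speaker code book but gives no implementation details. The following is only a minimal sketch of the general idea, assuming the basic LVQ1 update rule, one prototype per speaker, and made-up two-dimensional feature vectors standing in for the EMD residual features. The function names (train_lvq1, identify), the learning-rate schedule, and the toy data are all illustrative assumptions, not the authors' method.

```python
import math
import random


def nearest(codebook, x):
    """Index of the codebook entry closest to x (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i][0], x))


def train_lvq1(samples, labels, epochs=30, lr=0.3, seed=0):
    """Train an LVQ1 code book with one prototype per speaker (an assumption).

    Each prototype starts at the first training sample of its speaker.
    """
    rng = random.Random(seed)
    codebook = []
    for speaker in sorted(set(labels)):
        first = next(x for x, y in zip(samples, labels) if y == speaker)
        codebook.append((list(first), speaker))

    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            proto, proto_label = codebook[nearest(codebook, x)]
            # LVQ1: pull the winning prototype toward a correctly labelled
            # sample, push it away from an incorrectly labelled one.
            sign = 1.0 if proto_label == y else -1.0
            for d in range(len(proto)):
                proto[d] += sign * rate * (x[d] - proto[d])
    return codebook


def identify(codebook, x):
    """Return the speaker label of the prototype nearest to x."""
    return codebook[nearest(codebook, x)][1]


# Toy stand-in features (hypothetical): two speakers, three vectors each.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
labels = ["A", "A", "A", "B", "B", "B"]

codebook = train_lvq1(samples, labels)
speaker_near_a = identify(codebook, [0.05, 0.05])
speaker_near_b = identify(codebook, [1.0, 1.0])
```

In a real system the feature vectors would come from the signal-processing front end described in the abstract, and several prototypes per speaker would normally be used; a single prototype is used here only to keep the sketch short.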

Keywords :

audio databases; information retrieval; learning (artificial intelligence); natural language processing; probability; self-organising feature maps; speaker recognition; ANN based speaker recognition model; Assamese language; Devanagari based languages; EMD; LVQ; LVQ based speaker code book; North East Indian state; PNN; SOM-based vowel segmentation technique; acoustic information; empirical-mode decomposition; glottal source information extraction; learning vector quantization; neural framework; neural model; noise corrupted vowel sounds; probabilistic neural network; residual signal; self organizing map based vowel segmentation technique; speaker database; speaker discriminative feature; speaker discriminative information; speaker identification model; speech signal; Artificial neural networks; Databases; Hidden Markov models; Speaker recognition; Speech; Speech processing; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks (IJCNN), The 2013 International Joint Conference on

Conference_Location :

Dallas, TX

ISSN :

2161-4393

Print_ISBN :

978-1-4673-6128-6

Type :

conf

DOI :

10.1109/IJCNN.2013.6707000

Filename :

6707000

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=671658