Regional Information Center for Science and Technology - RNN and SOM based classifier to recognize Assamese fricative sounds designed using frame based temporal feature sets

DocumentCode :

1798121

Title :

RNN and SOM based classifier to recognize Assamese fricative sounds designed using frame based temporal feature sets

Author :

Patgiri, Chayashree ; Sarma, M. ; Sarma, Kandarpa Kumar

Author_Institution :

Dept. of Electron. Commun. Eng., Gauhati Univ., Guwahati, India

fYear :

2014

fDate :

6-11 July 2014

Firstpage :

3496

Lastpage :

3502

Abstract :

In this work, a Recurrent Neural Network (RNN) is trained on a frame-to-frame basis using cepstral features and a set of difference cepstral feature (DCF) vectors. The DCF vector is formulated to capture the temporal patterns of the fricative sounds, or phonemes, of the Assamese language. A hybrid algorithm is developed to recognize these fricative phonemes from certain words containing them. To preserve the temporal information of the speech segment, we consider a frame-based hybrid approach to recognizing fricatives in Assamese speech. A hybrid feature set is developed in which a simple frame-based feature is combined with a differential frame-based feature. Feature extraction techniques such as Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) have been investigated, and their performance has been compared with that obtained from the DCF set for Assamese fricative phoneme recognition. To extract features, the speech segment is divided into 20 millisecond frames with an overlap of 10 milliseconds. In addition, the difference between the current frame and its preceding and succeeding frames is used to form a more accurate dynamic approach to fricative recognition. The differential processing reduces correlation and retains only the most relevant portion of the input. After the feature vectors are obtained, a Self Organizing Map (SOM) is used to group related features into classes and remove repeated data. The features obtained from the phoneme signal are thus reduced to sets of cluster centres of different sizes provided by the SOM. The reduced feature vector is then applied to the RNN-based hybrid classifier, which learns the pattern and recognizes any unknown fricative segment.
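The framing and differencing steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the exact DCF formula, so the backward and forward frame differences below are an assumption, and a plain real cepstrum stands in for the LPCC/MFCC features. The function names, the 16 kHz sample rate, and the choice of 12 coefficients are all hypothetical.

```python
import numpy as np


def frame_signal(signal, sample_rate, frame_ms=20.0, overlap_ms=10.0):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = frame_len - int(round(sample_rate * overlap_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds samples [i*hop, i*hop + frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def real_cepstrum(frames, n_coeffs=12):
    """Stand-in cepstral feature (assumption: not the paper's LPCC/MFCC)."""
    windowed = frames * np.hamming(frames.shape[1])
    spectrum = np.abs(np.fft.rfft(windowed, axis=1))
    cepstrum = np.fft.irfft(np.log(spectrum + 1e-10), axis=1)
    return cepstrum[:, :n_coeffs]


def difference_features(features):
    """Assumed DCF: differences with the preceding and succeeding frames.

    For each interior frame t, returns [f[t] - f[t-1], f[t+1] - f[t]].
    The first and last frames are dropped because they lack a neighbour.
    """
    backward = features[1:-1] - features[:-2]
    forward = features[2:] - features[1:-1]
    return np.hstack([backward, forward])


def hybrid_features(features):
    """Concatenate the static frame features with their differences."""
    return np.hstack([features[1:-1], difference_features(features)])


if __name__ == "__main__":
    rate = 16000  # hypothetical sample rate
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(rate)  # 1 s of noise as placeholder audio
    frames = frame_signal(speech, rate)       # 99 frames of 320 samples
    static = real_cepstrum(frames)            # shape (99, 12)
    print(hybrid_features(static).shape)      # (97, 36)
```

In the pipeline the abstract describes, the rows of such a hybrid feature matrix would then be clustered by a SOM, and the resulting cluster centres passed to the RNN classifier.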

Keywords :

cepstral analysis; feature extraction; pattern classification; recurrent neural nets; self-organising feature maps; speech recognition; DCF set; DCF vectors; LPCC; MFCC; RNN; SOM; Assamese fricative phoneme recognition; Assamese language; correlation; difference cepstral feature vectors; feature extraction techniques; frame-based hybrid approach; hybrid classifier; hybrid feature set; linear predictive cepstral coefficient; mel-frequency cepstral coefficient; phoneme signal; preceding frame; recurrent neural network; self organizing map; speech segment; succeeding frame; Cepstrum; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech recognition; Vectors;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks (IJCNN), 2014 International Joint Conference on

Conference_Location :

Beijing

Print_ISBN :

978-1-4799-6627-1

Type :

conf

DOI :

10.1109/IJCNN.2014.6889792

Filename :

6889792

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1798121