Dictionary learning based sparse coefficients for speech recognition in noisy environment

Author

R S Ramitha;M Baburaj;Sudhish N George

Author_Institution

Electronics and Communication Engineering, National Institute of Technology, Calicut, Kerala, India

fYear

2015

Firstpage

151

Lastpage

156

Abstract

Automatic speech recognition is an active research area that enables smooth human-machine interaction. Complexity and recognition accuracy depend mainly on the choice of signal features and classifier. Features commonly used in speech recognition include mel-frequency cepstral coefficients (MFCCs), line spectral frequencies (LSF), short-time energy (STE) and linear prediction coefficients (LPC). In this paper, instead of these well-established features, sparse features obtained by sparse coding over a dictionary of signal atoms are used. To improve performance, an artificial neural network (ANN) is used to classify isolated speech. Experimental results show that the proposed method works better in noisy environments up to 20 dB SNR without any speech enhancement method. To remove heavy background noise, a sparsity-based speech enhancement algorithm is also proposed for the preprocessing stage of speech recognition. The proposed algorithm is compared with other popular speech recognition methods, and it is observed to achieve better performance than the others.
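
The idea of using sparse coefficients over a dictionary of signal atoms as features can be illustrated with a minimal sketch. The abstract does not state which dictionary learning or sparse coding algorithm the authors use, so everything here is an assumption for illustration: the greedy orthogonal matching pursuit coder, the random unit-norm dictionary `D` (standing in for a learned one), the frame length of 64, the 128 atoms, and the names `omp`, `frame` and `features` are all hypothetical.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit: approximate x using at most n_nonzero atoms (columns) of D."""
    residual = x.astype(float).copy()
    support = []                       # indices of the atoms selected so far
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:               # no new atom helps; stop early
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coef[support] = sol
    return coef

# Assumed setup: a random dictionary with unit-norm atoms, not a learned one.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)

# Synthetic "frame" built from three atoms, standing in for a speech frame.
true_coef = np.zeros(128)
true_coef[[5, 40, 99]] = [1.5, -2.0, 0.8]
frame = D @ true_coef

# The sparse coefficient vector is what would be passed to a classifier as the feature.
features = omp(D, frame, n_nonzero=3)
```

In a pipeline of the kind the abstract describes, one such coefficient vector per frame would replace MFCC-style features at the classifier input; how the authors aggregate frames and train the ANN is not specified here.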

Keywords

"Dictionaries","Speech","Speech recognition","Feature extraction","Speech enhancement","Training","Matching pursuit algorithms"

Publisher

IEEE

Conference_Titel

2015 IEEE Recent Advances in Intelligent Computational Systems (RAICS)

Type

conf

DOI

10.1109/RAICS.2015.7488405

Filename

7488405