Title :
Speaker identification in noisy environment using bispectrum analysis and probabilistic neural network
Author :
Kusumoputro, Benyamin ; Triyanto, Adi ; Fanany, M. Ivan ; Jatmiko, Wisnu
Author_Institution :
Fac. of Comput. Sci., Univ. of Indonesia, Indonesia
Abstract :
This paper describes a neural approach to extracting bispectrum features from speech data and the use of a probabilistic neural network (PNN) as the classifier in an automatic speech recognition system. Early speech recognition systems commonly relied on power spectrum analysis for feature extraction; however, the recognition rate of such systems is not high enough, especially when Gaussian noise is added to the speech utterances. In this paper, we develop a speaker identification system based on bispectrum feature analysis. To analyse the distribution of the bispectrum data over its two-dimensional representation, we develop an adaptive feature extraction mechanism based on a cascade neural network. A cascade of SOFM (Self-Organizing Feature Map) and LVQ (Learning Vector Quantization) serves as an adaptive codebook generation algorithm for determining the feature distribution of the bispectrum speech data. The K-L transformation (K-LT) is then applied as a preprocessing step before the neural classifier. The K-LT is shown to be an effective procedure for orthogonalization and dimensionality reduction of the codebook vectors generated from the bispectrum data. Experimental results show that the system achieves a high recognition rate on undirected utterance speech, especially when a larger number of codebook vectors is used. They also show that the PNN increases the recognition rate significantly, even for speech data with added Gaussian noise.
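The two ends of the pipeline described in the abstract, bispectrum estimation and PNN classification, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common direct estimator B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)] and a Gaussian-kernel PNN, and it omits the SOFM/LVQ codebook and K-LT stages entirely. The function names, the kernel width `sigma`, and the toy data are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def bispectrum(frames):
    """Direct estimate: average of X(f1) * X(f2) * conj(X(f1 + f2)) over frames.

    Only the lower half of the frequency axis is evaluated for f1 and f2.
    """
    n = len(frames[0])
    half = n // 2
    acc = [[0j] * half for _ in range(half)]
    for frame in frames:
        spec = dft(frame)
        for f1 in range(half):
            for f2 in range(half):
                acc[f1][f2] += spec[f1] * spec[f2] * spec[(f1 + f2) % n].conjugate()
    return [[value / len(frames) for value in row] for row in acc]


def pnn_classify(x, train, sigma=0.5):
    """Probabilistic neural network with Gaussian kernels.

    `train` maps a class label to its list of training vectors. Each training
    vector contributes one kernel; kernels are averaged per class and the
    class with the largest average is returned. `sigma` is an assumed
    smoothing width, not a value taken from the paper.
    """
    scores = {}
    for label, vectors in train.items():
        total = 0.0
        for v in vectors:
            d2 = sum((a - b) ** 2 for a, b in zip(x, v))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        scores[label] = total / len(vectors)
    return max(scores, key=scores.get)
```

In a full system along the lines of the abstract, the vectors passed to `pnn_classify` would be K-LT-reduced codebook vectors derived from the bispectrum rather than raw toy points, with one class label per enrolled speaker.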
Keywords :
Gaussian noise; learning (artificial intelligence); neural nets; speech recognition; vector quantisation; Gaussian noise; K-L transformation; adaptive codebook generation algorithm; additional Gaussian noise; automatic speech recognition system; bispectrum analysis; codebook vectors; dimensionality reduction; feature distribution; learning vector quantization; noisy environment; orthogonalization; power spectrum analysis; preprocessing element; probabilistic neural network; speaker identification; Automatic speech recognition; Data mining; Feature extraction; Gaussian noise; Neural networks; Speech analysis; Speech enhancement; Speech processing; Speech recognition; Working environment noise;
Conference_Title :
Computational Intelligence and Multimedia Applications, 2001. ICCIMA 2001. Proceedings. Fourth International Conference on
Conference_Location :
Yokosuka City
Print_ISBN :
0-7695-1312-3
DOI :
10.1109/ICCIMA.2001.970480