• DocumentCode
    573553
  • Title

    A feature extraction method for speech recognition based on temporal tracking of clusters in spectro-temporal domain

  • Author

    Esfandian, Nafiseh ; Razzazi, Farbod ; Behrad, Alireza

  • Author_Institution
    Dept. of Electr. Eng., Islamic Azad Univ., Qaemshahr, Iran
  • fYear
    2012
  • fDate
    2-3 May 2012
  • Abstract
    This paper proposes a novel approach to secondary feature extraction based on cluster tracking in the spectro-temporal domain. Because of the high dimensionality of the spectro-temporal feature space, this domain is unsuitable for practical speech recognition systems. To reduce the dimensionality of the feature space, the weighted K-means (WKM) clustering technique is applied to the spectro-temporal domain. The elements of the mean vectors and covariance matrices of the clusters form the feature vector of each frame. However, the cluster locations change gradually over time. The main idea is that the variations in cluster locations should be tracked temporally, frame by frame, and that the parameters of these variations should be included in the secondary feature vector of each speech frame. Several models are used to register the clusters in each incoming frame. In addition, a new architecture is proposed that classifies speech frames with a combined classifier using both tracked and non-tracked secondary features. The proposed feature vectors were evaluated on the classification of several phoneme subsets of the TIMIT database. Using tracked secondary feature vectors, the classification rate on voiced plosives improved to 77.4%, a relative improvement of 1.8% over non-tracked secondary feature vectors. The results on other subsets also showed improved classification rates.
  • Keywords
    covariance matrices; feature extraction; pattern clustering; set theory; signal classification; speech recognition; vectors; TIMIT database phonemes; WKM clustering technique; dimension reduction; mean vector; secondary feature vector extraction; spectrotemporal feature space; speech classification; subsets; temporal cluster tracking; voiced plosives classification; weighted K-means clustering technique; Filter banks; Sorting; Spectrogram; Speech; Support vector machine classification; Auditory system; Clustering methods; Image matching; Speech processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Artificial Intelligence and Signal Processing (AISP), 2012 16th CSI International Symposium on
  • Conference_Location
    Shiraz, Fars
  • Print_ISBN
    978-1-4673-1478-7
  • Type

    conf

  • DOI
    10.1109/AISP.2012.6313709
  • Filename
    6313709
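
The abstract describes clustering spectro-temporal points with weighted K-means, using each cluster's mean and covariance elements as frame features, and tracking cluster locations from frame to frame. The following is a minimal sketch of that idea, not the authors' implementation. It assumes 2-D points with non-negative weights, and it tracks clusters simply by initializing each frame's clustering with the previous frame's centroids. The paper's actual registration models are not given in this record, so that tracking step, along with all function names and the toy data, is hypothetical.

```python
def weighted_kmeans(points, weights, init_centroids, n_iter=20):
    """Weighted K-means on 2-D points; returns (centroids, labels)."""
    centroids = [tuple(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                              + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid becomes the weighted mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == k]
            total = sum(w for _, w in members)
            if total == 0:
                new_centroids.append(old)  # leave an empty cluster where it was
            else:
                new_centroids.append((
                    sum(w * p[0] for p, w in members) / total,
                    sum(w * p[1] for p, w in members) / total,
                ))
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def cluster_features(points, weights, centroids, labels):
    """Concatenate [mean_x, mean_y, var_x, var_y, cov_xy] for every cluster."""
    feats = []
    for k, (mx, my) in enumerate(centroids):
        members = [(p, w) for p, w, l in zip(points, weights, labels) if l == k]
        total = sum(w for _, w in members)
        if total == 0:
            feats.extend([mx, my, 0.0, 0.0, 0.0])
            continue
        vxx = sum(w * (p[0] - mx) ** 2 for p, w in members) / total
        vyy = sum(w * (p[1] - my) ** 2 for p, w in members) / total
        vxy = sum(w * (p[0] - mx) * (p[1] - my) for p, w in members) / total
        feats.extend([mx, my, vxx, vyy, vxy])
    return feats


def track_frame(points, weights, prev_centroids):
    """Hypothetical tracking step: warm-start from the previous centroids,
    so cluster k in the new frame corresponds to cluster k in the old one,
    and report how far each centroid moved."""
    centroids, labels = weighted_kmeans(points, weights, prev_centroids)
    shifts = [(c[0] - p[0], c[1] - p[1]) for c, p in zip(centroids, prev_centroids)]
    return centroids, labels, shifts


# Toy data (made up): two frames, the second shifted by one unit along x.
frame1 = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (12.0, 10.0)]
frame2 = [(x + 1.0, y) for x, y in frame1]
w = [1.0, 3.0, 1.0, 1.0]

c1, l1 = weighted_kmeans(frame1, w, [(0.0, 0.0), (10.0, 10.0)])
f1 = cluster_features(frame1, w, c1, l1)
c2, l2, shifts = track_frame(frame2, w, c1)
```

In this sketch the per-frame feature vector `f1` holds the non-tracked features, while `shifts` stands in for the kind of variation parameters that tracked features would add.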