• DocumentCode
    1087926
  • Title
    A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition
  • Author
    Atal, Bishnu S. ; Rabiner, Lawrence R.
  • Author_Institution
    Bell Laboratories, Murray Hill, NJ
  • Volume
    24
  • Issue
    3
  • fYear
    1976
  • fDate
    6/1/1976
  • Firstpage
    201
  • Lastpage
    212
  • Abstract
    In speech analysis, the voiced-unvoiced decision is usually performed in conjunction with pitch analysis. Linking the voiced-unvoiced (V-UV) decision to pitch analysis not only results in unnecessary complexity, but also makes it difficult to classify short speech segments that are shorter than a few pitch periods in duration. In this paper, we describe a pattern recognition approach for deciding whether a given segment of a speech signal should be classified as voiced speech, unvoiced speech, or silence, based on measurements made on the signal. In this method, five different measurements are made on the speech segment to be classified. The measured parameters are the zero-crossing rate, the speech energy, the correlation between adjacent speech samples, the first predictor coefficient from a 12-pole linear predictive coding (LPC) analysis, and the energy in the prediction error. The speech segment is assigned to a particular class based on a minimum-distance rule obtained under the assumption that the measured parameters are distributed according to the multidimensional Gaussian probability density function. The means and covariances for the Gaussian distribution are determined from manually classified speech data included in a training set. The method has been found to provide reliable classification with speech segments as short as 10 ms and has been used for both speech analysis-synthesis and recognition applications. A simple nonlinear smoothing algorithm is described to provide a smooth 3-level contour of an utterance for use in speech recognition applications. Quantitative results and several examples illustrating the performance of the method are included in the paper.
  • Keywords
    Density measurement; Energy measurement; Joining processes; Linear predictive coding; Particle measurements; Pattern recognition; Performance analysis; Speech analysis; Speech coding; Speech recognition
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type
    jour
  • DOI
    10.1109/TASSP.1976.1162800
  • Filename
    1162800
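
The procedure summarized in the abstract (five measurements per segment, then a minimum-distance rule under a Gaussian assumption) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`lpc_autocorr`, `features`, `fit_classes`, `classify`), the autocorrelation-method LPC with a small regularization term, the dB scaling of the two energies, and the use of a plain Mahalanobis-type distance without a log-determinant term are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def lpc_autocorr(frame, order=12):
    """Levinson-Durbin recursion (autocorrelation method, an assumption here).
    Returns (predictor coefficients alpha[1..order], prediction-error energy)."""
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    r[0] = r[0] * (1.0 + 1e-6) + 1e-12   # tiny regularization, illustrative only
    a = np.zeros(order + 1)              # error filter A(z) = 1 + sum a_j z^-j
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                   # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return -a[1:], err

def features(frame, order=12):
    """Five measurements named in the abstract, for one speech segment."""
    frame = np.asarray(frame, dtype=float)
    eps = 1e-10
    # 1. number of zero crossings in the segment
    zc = np.count_nonzero(np.signbit(frame[1:]) != np.signbit(frame[:-1]))
    # 2. speech energy (expressed in dB here, an assumption)
    energy = np.sum(frame ** 2)
    log_e = 10.0 * np.log10(eps + energy / len(frame))
    # 3. normalized correlation between adjacent samples
    c1 = np.dot(frame[1:], frame[:-1]) / np.sqrt(
        eps + np.sum(frame[1:] ** 2) * np.sum(frame[:-1] ** 2))
    # 4. first predictor coefficient and 5. prediction-error energy
    alpha, err = lpc_autocorr(frame, order)
    log_ep = 10.0 * np.log10(eps + err / (eps + energy))
    return np.array([zc, log_e, c1, alpha[0], log_ep])

def fit_classes(X, y):
    """Per-class mean vector and covariance matrix from hand-labelled data."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c], rowvar=False))
            for c in np.unique(y)}

def classify(x, models):
    """Assign x to the class with the smallest distance (x-m)^T W^-1 (x-m)."""
    def dist(m, W):
        d = x - m
        return d @ np.linalg.solve(W, d)
    return min(models, key=lambda c: dist(*models[c]))
```

In use, one would call `features` on each manually labelled training segment, pass the stacked vectors and labels to `fit_classes`, and then call `classify(features(segment), models)` on new segments. At a 10 kHz sampling rate (an assumed value), a 10 ms segment corresponds to 100 samples.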