• DocumentCode
    2887880
  • Title

    Adding Voicing Features into Speech Recognition Based on HMM in Slovak

  • Author

    Kacur, Juraj ; Rozinaj, Gregor

  • Author_Institution
    Dept. of Telecommun., STU, Bratislava, Slovakia
  • fYear
    2009
  • fDate
    18-20 June 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This article discusses the impact of substituting some of the basic speech features with voiced/unvoiced information and, optionally, with the estimated pitch value. The average magnitude difference function (AMDF) was adopted as the measure of the signal's voicing, specifically the ratio of its average value to its local minima found within the accepted pitch range. In addition, the pitch itself was used as an auxiliary feature alongside the base MFCC and PLP features. Experiments were performed on the professional SPEECHDAT-SK database for mobile applications operating in harsh conditions, using various HMM models of context-dependent and context-independent phonemes. All models were trained following the MASPER training scheme. In all cases, the voicing feature improved results by more than 9% compared to the base systems. However, the pitch itself was not always beneficial in a speaker-independent ASR system evaluated over different tasks.
  • Keywords
    natural language processing; speech recognition; HMM; MASPER training scheme; MFCC; PLP features; SPEECHDAT-SK; Slovak; automatic speech recognition system; context dependent phonemes; context independent phonemes; mobile applications; pitch value estimation; professional database; voicing feature addition; Acceleration; Automatic speech recognition; Context modeling; Data mining; Frequency estimation; Hidden Markov models; Mel frequency cepstral coefficient; Robustness; Spatial databases; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Signals and Image Processing, 2009. IWSSIP 2009. 16th International Conference on
  • Conference_Location
    Chalkida
  • Print_ISBN
    978-1-4244-4530-1
  • Electronic_ISBN
    978-1-4244-4530-1
  • Type

    conf

  • DOI
    10.1109/IWSSIP.2009.5367743
  • Filename
    5367743
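
The voicing measure named in the abstract (the ratio of the AMDF's average value to its minimum within the accepted pitch range) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formula, so the function names, the 60-400 Hz pitch range, the use of the single deepest minimum in that range, and the small constant guarding the division are all assumptions made for illustration.

```python
import numpy as np


def amdf(frame, max_lag):
    """Average magnitude difference function for lags 0..max_lag."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    values = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Mean absolute difference between the frame and its shifted copy.
        values[lag] = np.mean(np.abs(frame[lag:] - frame[:n - lag]))
    return values


def voicing_and_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Return (voicing ratio, pitch estimate in Hz) for one frame.

    f_min and f_max are assumed pitch limits, not values from the paper.
    The frame must be longer than fs / f_min samples.
    """
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    d = amdf(frame, lag_max)
    # Deepest AMDF valley inside the accepted pitch range (assumed variant).
    best_lag = int(np.argmin(d[lag_min:lag_max + 1])) + lag_min
    # Ratio of the average AMDF value (lag 0 excluded) to that minimum;
    # the tiny constant only guards against division by zero.
    ratio = d[1:].mean() / (d[best_lag] + 1e-12)
    return ratio, fs / best_lag
```

A periodic (voiced-like) frame produces a deep AMDF valley at its period and therefore a large ratio, while a noise-like (unvoiced) frame yields a ratio close to 1. For example, `voicing_and_pitch(np.sin(2 * np.pi * 100 * np.arange(400) / 8000), fs=8000)` places the valley at a lag of 80 samples, which corresponds to 100 Hz.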