• DocumentCode
    62822
  • Title
    Whisper-to-speech conversion using restricted Boltzmann machine arrays
  • Author
    Jing-jie Li ; Ian Vince McLoughlin ; Li-Rong Dai ; Zhen-Hua Ling
  • Author_Institution
    Univ. of Sci. & Technol. of China, Hefei, China
  • Volume
    50
  • Issue
    24
  • fYear
    2014
  • fDate
    11 20 2014
  • Firstpage
    1781
  • Lastpage
    1782
  • Abstract
    Whispers are a natural vocal communication mechanism in which the vocal cords do not vibrate normally. The lack of glottal-induced pitch leads to low energy, and an inherent noise-like spectral distribution reduces intelligibility. Much research has been devoted to whisper processing, including the conversion of whispers to speech. Unfortunately, even the best speech reconstructed by existing approaches still sounds obviously artificial and muffled and suffers from unnatural prosody. To address these issues, the novel use of multiple restricted Boltzmann machines (RBMs) is reported as a statistical conversion model between whisper and speech spectral envelopes. Moreover, the accuracy of the estimated pitch is improved by applying machine learning techniques to pitch estimation only within voiced (V) regions. Both objective and subjective evaluations show that the new method improves the quality of whisper-reconstructed speech compared with state-of-the-art approaches.
  • Keywords
    Boltzmann machines; learning (artificial intelligence); speech intelligibility; speech processing; statistical analysis; Gaussian mixture model; RBM arrays; artificial muffle; glottal-induced pitch lead; human-to-human vocal communication mechanism; inherent noise-like spectral distribution; machine learning technique; pitch accuracy; pitch estimation; restricted Boltzmann machine array; speech reconstruction; speech spectral envelope; statistical conversion model; unnatural prosody; vocal cord; voiced region; whisper processing; whisper-to-speech conversion
  • fLanguage
    English
  • Journal_Title
    Electronics Letters
  • Publisher
    IET
  • ISSN
    0013-5194
  • Type
    jour
  • DOI
    10.1049/el.2014.1645
  • Filename
    6969246