Regional Information Center for Science and Technology - Learning a better representation of speech soundwaves using restricted Boltzmann machines

DocumentCode :

2182775

Title :

Learning a better representation of speech soundwaves using restricted Boltzmann machines

Author :

Jaitly, Navdeep ; Hinton, Geoffrey

Author_Institution :

Dept. of Comput. Sci., Univ. of Toronto, Toronto, ON, Canada

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

5884

Lastpage :

5887

Abstract :

State-of-the-art speech recognition systems rely on preprocessed speech features such as Mel cepstrum or linear predictive coding coefficients that collapse high-dimensional speech sound waves into low-dimensional encodings. While these have been successfully applied in speech recognition systems, such low-dimensional encodings may lose some relevant information and express other information in a way that makes it difficult to use for discrimination. Higher-dimensional encodings could both improve performance in recognition tasks and be applied to speech synthesis by better modeling the statistical structure of the sound waves. In this paper we present a novel approach for modeling speech sound waves using a restricted Boltzmann machine (RBM) with a novel type of hidden variable, and we report initial results demonstrating phoneme recognition performance better than the current state of the art for methods based on Mel cepstrum coefficients.

Keywords :

Boltzmann machines; speech recognition; speech synthesis; Mel cepstrum coefficient; hidden variable; phoneme recognition; restricted Boltzmann machine; speech recognition system; speech sound wave; Artificial neural networks; Encoding; Hidden Markov models; Mathematical model; Speech; Speech recognition; Training; RBM; Restricted Boltzmann Machine; TIMIT

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5947700

Filename :

5947700

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2182775