DocumentCode
2803154
Title
New resources for Brazilian Portuguese: Results for grapheme-to-phoneme and phone classification
Author
Hosn, Chadia ; Baptista, Luiz Alberto ; Imbiriba, Tales ; Klautau, Aldebaro
Author_Institution
Fed. Univ. of Pará, Belém-PA
fYear
2006
fDate
3-6 Sept. 2006
Firstpage
477
Lastpage
482
Abstract
Speech processing is a data-driven technology that relies on public corpora and associated resources. In contrast to languages such as English, there are few resources for Brazilian Portuguese (BP). Consequently, there are no publicly available scripts to design baseline BP systems. This work discusses some efforts towards narrowing this gap and presents results for two speech processing tasks for BP: phone classification and grapheme-to-phoneme (G2P) conversion. The former task used hidden Markov models to classify phones from the Spoltech and TIMIT corpora. The G2P module adopted machine learning methods such as decision trees and was tested on a new BP pronunciation dictionary and on the following languages: British English, American English and French.
Keywords
hidden Markov models; learning (artificial intelligence); natural languages; speech processing; Brazilian Portuguese; baseline BP system; data-driven technology; grapheme-to-phoneme conversion; hidden Markov model; machine learning method; phone classification; speech processing; Classification tree analysis; Decision trees; Dictionaries; Hidden Markov models; Learning systems; Natural languages; Speaker recognition; Speech processing; Speech recognition; Testing; Grapheme-to-phoneme; decision trees; hidden Markov models; letter-to-sound; phone classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Telecommunications Symposium, 2006 International
Conference_Location
Fortaleza, Ceará
Print_ISBN
978-85-89748-04-9
Electronic_ISBN
978-85-89748-04-9
Type
conf
DOI
10.1109/ITS.2006.4433322
Filename
4433322