DocumentCode :
404845
Title :
Phoneme alignment of Filipino speech corpus
Author :
Sagum, Ramil G. ; Ensomo, Ryan A. ; Tan, Emerson M. ; Guevara, Rowena Cristina L.
Author_Institution :
Dept. of Electr. & Electron. Eng., Philippines Univ., Quezon City, Philippines
Volume :
3
fYear :
2003
fDate :
15-17 Oct. 2003
Firstpage :
964
Abstract :
Segmentation and transcription of a speech corpus are prerequisites for developing an automatic speech recognition (ASR) system. In this paper, we develop a method for automatically segmenting and transcribing the Filipino speech corpus that is being developed at the DSP laboratory. A multi-layer perceptron (MLP) takes speech feature inputs and multiplies them by weights computed from a training set of labeled speech. The system is based on this MLP and a start synchronous decoder. The corpus was divided into three sub-corpora: the paragraphs and sentences sub-corpus (par+sen), the words sub-corpus, and the syllables sub-corpus. For the par+sen sub-corpus, we obtained a 62.64% phoneme recognition rate with 75.68% of labels within 20 ms of hand-labeled transcriptions; for the words sub-corpus, a 63.93% phoneme recognition rate with 72.38% within 20 ms of hand-labeled transcriptions; and for the syllables sub-corpus, a 72.60% phoneme recognition rate with 75.69% within 20 ms of hand-labeled transcriptions.
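The "within 20 ms of hand-labeled transcriptions" figures reported in the abstract can be read as a boundary-agreement rate. The sketch below is one plausible way to compute such a rate, not the authors' procedure: it assumes automatic and hand-labeled boundaries are already paired one-to-one and that the tolerance is inclusive, neither of which the abstract states. The function name `boundary_agreement` and the sample times are hypothetical.

```python
def boundary_agreement(auto_ms, hand_ms, tolerance_ms=20.0):
    """Return the fraction of paired boundaries that differ by at most tolerance_ms.

    Assumption (not from the abstract): the two lists are aligned one-to-one,
    so auto_ms[i] and hand_ms[i] mark the same phoneme boundary.
    """
    if len(auto_ms) != len(hand_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not auto_ms:
        return 0.0
    # Count boundaries whose absolute time difference is within the tolerance.
    hits = sum(1 for a, h in zip(auto_ms, hand_ms) if abs(a - h) <= tolerance_ms)
    return hits / len(auto_ms)


# Hypothetical boundary times in milliseconds (illustrative, not from the paper).
auto = [100.0, 255.0, 410.0, 590.0]
hand = [110.0, 250.0, 440.0, 600.0]
rate = boundary_agreement(auto, hand)
print(f"{rate:.2%} of boundaries within 20 ms")
```

In this made-up example, three of the four boundaries fall within the tolerance. A real evaluation would first need a rule for pairing boundaries when the recognized phoneme sequence differs from the hand-labeled one.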
Keywords :
multilayer perceptrons; natural languages; speech processing; speech recognition; Filipino speech corpus; automatic speech recognition system; multilayer perceptron; phoneme alignment; start synchronous decoder; Automatic speech recognition; Decoding; Digital signal processing; Hidden Markov models; Laboratories; Multilayer perceptrons; Natural languages; Neural networks; Speech processing; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
TENCON 2003. Conference on Convergent Technologies for the Asia-Pacific Region
Print_ISBN :
0-7803-8162-9
Type :
conf
DOI :
10.1109/TENCON.2003.1273390
Filename :
1273390