DocumentCode
1833096
Title
Diacritization, automatic segmentation and labeling for Levantine Arabic speech
Author
Alotaibi, Yousef A. ; Meftah, Ali H. ; Selouani, Sid-Ahmed
Author_Institution
Coll. of Comput. & Inf. Sci., King Saud Univ., Riyadh, Saudi Arabia
fYear
2013
fDate
11-14 Aug. 2013
Firstpage
7
Lastpage
11
Abstract
It is generally acknowledged that a reliable speech corpus is necessary for any application involving speech processing. In this paper, we propose methods to improve the reliability and efficiency of the BBN/AUB DARPA Babylon Levantine Arabic speech corpus. To this end, pronunciation correction, diacritization, and a new transcription are performed manually, while phoneme segmentation and labeling are performed automatically. A comparison with the original transcription of the corpus shows a clear improvement in the output results.
Keywords
natural language processing; speech processing; BBN-AUB DARPA Babylon Levantine Arabic speech corpus; automatic phoneme labeling; automatic phoneme segmentation; diacritization correction; pronunciation correction; transcription; Educational institutions; Hidden Markov models; Labeling; Reliability; Speech; Speech recognition; BBN/AUB; Levantine; diacritics; dialect
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Signal Processing and Signal Processing Education Meeting (DSP/SPE), 2013 IEEE
Conference_Location
Napa, CA
Print_ISBN
978-1-4799-1614-6
Type
conf
DOI
10.1109/DSP-SPE.2013.6642556
Filename
6642556