DocumentCode :
2851224
Title :
Audio Indexing of Arabic broadcast news
Author :
Billa, J. ; Noamany, M. ; Srivastava, A. ; Liu, D. ; Stone, R. ; Xu, J. ; Makhoul, J. ; Kubala, F.
Author_Institution :
BBN Technologies, Cambridge, MA 02138, USA
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
This paper describes the development of the BBN Audio Indexing System for broadcast news in Arabic. Key issues addressed in this work revolve around the three major components of the audio indexing system: automatic speech recognition, speaker identification, and named entity identification. The system deals with several challenges introduced by the Arabic language, including the absence of short vowels in written text and the presence of compound words that are formed by the concatenation of certain conjunctions, prepositions, articles, and pronouns, as prefixes and suffixes to the word stem. The lack of short vowels in the transcripts prompted a novel solution that further demonstrated the power of hidden Markov models to deal with ambiguity. Another challenge was the acquisition of appropriate language modeling data, given the absence of broadcast news data for that purpose. We present performance results for all three components of the Audio Indexing System, which we believe represent the state of the art for Arabic broadcast news.
Keywords :
Biomedical monitoring; Electric breakdown; Error analysis; Indexing; Speech recognition; TV; Training
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743640
Filename :
5743640