Title :
ASCII based transcription systems for languages with the Arabic script: the case of Persian
Author :
Ganjavi, Shadi ; Georgiou, Panayiotis G. ; Narayanan, Shrikanth
Author_Institution :
Dept. of Linguistics, Univ. of Southern California, CA, USA
Date :
30 Nov.-3 Dec. 2003
Abstract :
We discuss the transcription systems needed for automated spoken language processing applications in languages, such as Persian, that are written in the Arabic script. The work is described in the context of developing a speech-to-speech translation system for English and Persian. This system can easily be modified for Arabic, Dari, Urdu and any other language that uses the Arabic script. The proposed system has two components. The first is a phonemic transcription of sounds, intended for acoustic modeling in automatic speech recognizers and for text-to-speech synthesizers, that uses ASCII-based symbols rather than International Phonetic Alphabet symbols. The second is a hybrid system that provides a minimally ambiguous lexical representation with explicit vocalic information; such a representation is needed for language modeling and machine translation.
Keywords :
language translation; linguistics; natural language interfaces; natural languages; signal representation; speech recognition; speech synthesis; speech-based user interfaces; text analysis; ASCII based transcription systems; Arabic script; English language; International Phonetic Alphabet symbols; Persian language; automated spoken language processing; automatic speech recognition; language modeling; machine translation; sound transcription; speech-to-speech translation system; text-to-speech synthesizers; Automatic speech recognition; Computer aided software engineering; Laboratories; Natural languages; Speech analysis; Speech recognition; Speech synthesis; Synthesizers; Text recognition; Writing
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318507