DocumentCode :
2314839
Title :
A generalized LR parser for text-to-speech synthesis
Author :
Heggtveit, Per Olav
Author_Institution :
Telenor R&D, Kjeller, Norway
Volume :
3
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
1429
Abstract :
The development of a parser for a Norwegian text-to-speech system is reported. The generalized left-right (GLR) algorithm is applied, which is a generalization of the well-known LR algorithm for parsing computer languages. This paper briefly describes the GLR algorithm, the integration of a probabilistic scoring model, our implementation of the parser in C++, the attribute structures, the lexical interface, and the application of the parser to part-of-speech (POS) tagging for Norwegian. Applied to a small test set of about 4,000 words, this method correctly tags 96% of the known words, which is close to the performance of other POS-taggers trained on large text databases. 85% of the unknown words are tagged correctly, and the probability of choosing the wrong pronunciation of a word from the lexicon is less than 0.1%.
Keywords :
grammars; natural languages; speech synthesis; C++ implementation; Norwegian language; attribute structures; computer languages; generalized left-right parser; incorrect pronunciation probability; large text databases; lexical interface; lexicon; part-of-speech tagging; performance; probabilistic scoring model; text-to-speech synthesis; Application software; Computer languages; Databases; Morphology; Natural languages; Research and development; Robustness; Speech synthesis; Tagging; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607883
Filename :
607883