Title :
Language processing for name and address reading in Hungarian
Author :
Németh, Géza ; Zainkó, Csaba ; Kiss, Géza ; Fék, Márk ; Gordos, Géza ; Olaszy, Gábor
Author_Institution :
Dept. of Telecommun. & Media Informatics, Budapest Univ. of Technol. & Econ., Hungary
Abstract :
Name and address reading is an important combined application area of language processing and text-to-speech (TTS) systems. It is the cornerstone of both traditional reverse directory telephone services and new location-based traffic and tour guide applications. The language processing aspects of a solution for Hungarian are described. The work was based on the analysis of a subscriber database containing about 3 million records (there are about 10 million Hungarian citizens). Categories of name and address elements were defined. A program for the automatic classification of database records was developed. Statistical parameters were derived for proper/legal names and addresses. Based on these results, text corpora for enriching the TTS acoustic database were designed. Reading strategies and related special algorithms and tables were developed for the description of complex name categories. Our results may be applied to similar tasks in other languages with comparable linguistic and statistical features.
Keywords :
database management systems; linguistics; natural languages; speech synthesis; Hungarian; Hungarian citizen; TTS acoustic database; address reading; automatic syllabification; corpus analysis; directory telephone service; language processing; name reading; statistical parameter; subscriber database; text-to-speech system; tour guide application; Acoustic applications; Continuous improvement; Databases; GSM; Humans; Informatics; Laboratories; Natural languages; Speech synthesis; Telephony
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7803-7902-0
DOI :
10.1109/NLPKE.2003.1275906