Title :
Efficient implementation of morphological finite-state transition networks employing their statistical properties
Author :
Glushnev, N. ; O´Donovan, B. ; Troussov, A.
Abstract :
Here we study numerically the structure of directed state transition graphs for several types of finite-state devices representing morphology of 16 languages. In all numerical experiments we have found that the distribution of incoming and outcoming links is highly skewed and is modelled well by the power law, not by the Poisson distribution typical for classical random graphs. Studied for three languages, distribution of nodes according to the traffic they experience during corpora processing obeys the power law as well. Traffic and out-degree are the parameters, which affect performance of finite-state devices. We discuss how specific properties of power law, like distribution of these parameters (coexistence of small number of "hubs" with large number of "small events"), can be exploited for efficient computer implementation of finite-state devices used in morphology.
Keywords :
computational linguistics; directed graphs; finite state machines; natural languages; statistical analysis; directed state transition graphs; finite-state devices; finite-state transition networks; power law; statistical property; Application software; Automata; Computational linguistics; Distributed computing; Graph theory; Information analysis; Natural language processing; Surface morphology; Traffic control; Transducers;
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7803-7902-0
DOI :
10.1109/NLPKE.2003.1275868