Title :
A novel technique for words reordering based on N-grams
Author :
Athanaselis, Theologos ; Bakamidis, Stelios ; Dologlou, Ioannis
Author_Institution :
Inst. for Language & Speech Process., Athens
Abstract :
This paper presents an approach for repairing word order errors in English text by reordering the words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of the method lies in its use of an efficient confusion matrix technique for reordering the words. Unigram probabilities are used to further reduce the number of permutations. The comparative advantage of the method is that it works with a large set of words and avoids the laborious and costly process of collecting word order errors to create error patterns.
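The core idea of the abstract (score candidate word orders by trigram hits and keep the best one) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the confusion matrix technique or the unigram-based pruning, so this sketch swaps in plain exhaustive permutation. The trigram set `KNOWN_TRIGRAMS` and the function names `trigram_hits` and `best_reordering` are hypothetical, chosen only for illustration.

```python
from itertools import permutations

# Hypothetical toy stand-in for a language model: a set of trigrams
# assumed to be attested. A real system would query a large n-gram model.
KNOWN_TRIGRAMS = {
    ("the", "cat", "sat"),
    ("cat", "sat", "down"),
}


def trigram_hits(words, known=KNOWN_TRIGRAMS):
    """Count how many consecutive word triples of `words` are in `known`."""
    return sum(tuple(words[i:i + 3]) in known for i in range(len(words) - 2))


def best_reordering(words, known=KNOWN_TRIGRAMS):
    """Score every permutation exhaustively and return the one with most hits.

    Assumption: brute force replaces the paper's confusion matrix technique,
    so the cost grows factorially with sentence length.
    """
    return max(permutations(words), key=lambda p: trigram_hits(p, known))


candidate = best_reordering(["cat", "the", "down", "sat"])
print(" ".join(candidate))  # prints "the cat sat down"
```

Exhaustive search is only practical for very short sentences, which is presumably why the paper restricts the set of permutations it considers.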
Keywords :
natural languages; probability; text analysis; English text; confusion matrix; language model; trigram hits; word order errors; words reordering; Computer errors; Internet; Machine learning algorithms; Natural languages; Probability; Search engines; Speech processing; Testing; Text recognition; Writing;
Conference_Titel :
Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on
Conference_Location :
Sharjah
Print_ISBN :
978-1-4244-0778-1
Electronic_ISBN :
978-1-4244-1779-8
DOI :
10.1109/ISSPA.2007.4555284