Title :
Author Identification in Albanian Language
Author :
Paci, Hakik ; Kajo, Elinda ; Trandafili, Evis ; Tafa, Igli ; Salillari, Denisa
Author_Institution :
Inf. Eng. Dept., Polytech. Univ. of Tirana, Tirana, Albania
Abstract :
Authorship identification has long been a focus of research, but most of the work to date addresses the English language. The Albanian language remains largely unexplored in this field because it differs considerably from other languages, in particular in its difficult syntactic structure. This motivated us to adapt an authorship identification algorithm to Albanian books. Our previous work adopted the Dmitri Khmelev algorithm for identifying the authorship of Albanian texts. In this paper we improve the algorithm by taking into account the syntactic structure of Albanian sentences and adding specific linguistic elements to the problem. On the same set of books, the results we obtained were better than those of the basic models of the Dmitri Khmelev algorithm.
Keywords :
natural language processing; text analysis; Albanian books; Albanian language; Albanian texts; Dmitri Khmelev algorithm; English language; author identification; linguistic elements; syntactic structure; DNA; Databases; Dictionaries; Frequency measurement; Markov processes; Pragmatics; Syntactics; Dmitri Khmelev algorithms
Conference_Title :
Network-Based Information Systems (NBiS), 2011 14th International Conference on
Conference_Location :
Tirana
Print_ISBN :
978-1-4577-0789-6
Electronic_ISBN :
2157-0418
DOI :
10.1109/NBiS.2011.71