DocumentCode :
1685009
Title :
Author Identification in Albanian Language
Author :
Paci, Hakik ; Kajo, Elinda ; Trandafili, Evis ; Tafa, Igli ; Salillari, Denisa
Author_Institution :
Inf. Eng. Dept., Polytech. Univ. of Tirana, Tirana, Albania
fYear :
2011
Firstpage :
425
Lastpage :
430
Abstract :
Authorship identification has long been a focus of research, but most of the work concerns the English language. The Albanian language remains largely unaddressed in this field because its difficult syntactic structure sets it apart from other languages. This motivated us to adapt an authorship identification algorithm to Albanian books. Our previous work adopted the Dmitri Khmelev algorithm to identify the authorship of Albanian texts. In this paper we improve the algorithm by taking into account the syntactic structure of Albanian sentences and adding specific linguistic elements to the problem. On the same set of books, the results we obtained were better than those of the basic models of the Dmitri Khmelev algorithms.
Keywords :
natural language processing; text analysis; Albanian books; Albanian language; Albanian texts; Dmitri Khmelev algorithm; English language; author identification; linguistic elements; syntactic structure; DNA; Databases; Dictionaries; Frequency measurement; Markov processes; Pragmatics; Syntactics; Author identification; Dmitri Khmelev algorithms; syntactic structure;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Network-Based Information Systems (NBiS), 2011 14th International Conference on
Conference_Location :
Tirana
ISSN :
2157-0418
Print_ISBN :
978-1-4577-0789-6
Electronic_ISBN :
2157-0418
Type :
conf
DOI :
10.1109/NBiS.2011.71
Filename :
6041950