• DocumentCode
    1685009
  • Title
    Author Identification in Albanian Language
  • Author
    Paci, Hakik ; Kajo, Elinda ; Trandafili, Evis ; Tafa, Igli ; Salillari, Denisa

  • Author_Institution
    Inf. Eng. Dept., Polytech. Univ. of Tirana, Tirana, Albania
  • fYear
    2011
  • Firstpage
    425
  • Lastpage
    430
  • Abstract
    Authorship identification has long been a focus of research, but most of the work addresses the English language. The Albanian language remains understudied in this field because it differs substantially from other languages, in particular in its difficult syntactic structure. This motivated us to adapt an authorship identification algorithm to Albanian books. Our previous work adopted Dmitri Khmelev's algorithm for identifying the authorship of Albanian texts. In this paper we improve the algorithm by taking into account the syntactic structure of Albanian sentences and adding specific linguistic elements to the problem. On the same set of books, the results we obtained were better than those of the basic models of Dmitri Khmelev's algorithms.
  • Keywords
    natural language processing; text analysis; Albanian books; Albanian language; Albanian texts; Dmitri Khmelev algorithm; English language; author identification; linguistic elements; syntactic structure; DNA; Databases; Dictionaries; Frequency measurement; Markov processes; Pragmatics; Syntactics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Network-Based Information Systems (NBiS), 2011 14th International Conference on
  • Conference_Location
    Tirana
  • ISSN
    2157-0418
  • Print_ISBN
    978-1-4577-0789-6
  • Electronic_ISBN
    2157-0418
  • Type
    conf
  • DOI
    10.1109/NBiS.2011.71
  • Filename
    6041950