• DocumentCode
    1634505
  • Title
    Author Identification Using Compression Models
  • Author
    Pavelec, D. ; Oliveira, L.S. ; Justino, E. ; Neto, F. D. Nobre ; Batista, L.V.
  • Author_Institution
    Pontificia Univ. Catolica do Parana, Curitiba, Brazil
  • fYear
    2009
  • Firstpage
    936
  • Lastpage
    940
  • Abstract
    This paper discusses the use of compression algorithms for author identification. We give the basic background on compression algorithms and introduce the prediction by partial matching (PPM) algorithm, which is used in our experiments. To put the PPM results in context, we also report experiments with stylometric features commonly used by forensic examiners; in this case the authors are modeled using support vector machines. Comprehensive experiments on a database of 20 different authors show that the PPM algorithm is an interesting alternative for author identification, since the entire process of feature definition, extraction, and selection can be avoided.
  • Keywords
    data compression; feature extraction; pattern matching; support vector machines; PPM algorithm; author identification; compression algorithm; compression model; feature definition; feature selection; forensic examiner; prediction by partial matching algorithm; stylometric features; support vector machine; Algorithm design and analysis; Compression algorithms; Forensics; Frequency; History; Pediatrics; Spatial databases; Text analysis; compression models
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4244-4500-4
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2009.208
  • Filename
    5277555