DocumentCode
1634505
Title
Author Identification Using Compression Models
Author
Pavelec, D. ; Oliveira, L.S. ; Justino, E. ; Neto, F. D Nobre ; Batista, L.V.
Author_Institution
Pontificia Univ. Catolica do Parana, Curitiba, Brazil
fYear
2009
Firstpage
936
Lastpage
940
Abstract
In this paper we discuss the use of compression algorithms for author identification. We present the basic background on compression algorithms and introduce the prediction by partial matching (PPM) algorithm, which was used in our experiments. To put the results produced by the PPM algorithm in context, we also present experiments using stylometric features that are often used by forensic examiners. In this case the authors are modeled using support vector machines. Comprehensive experiments performed on a database composed of 20 different authors show that the PPM algorithm is an interesting alternative for author identification, since the entire process of feature definition, extraction, and selection can be avoided.
Keywords
data compression; feature extraction; pattern matching; support vector machines; PPM algorithm; author identification; compression algorithm; compression model; feature definition; feature extraction; feature selection; forensic examiner; prediction by partial matching algorithm; stylometric features; support vector machine; Algorithm design and analysis; Compression algorithms; Feature extraction; Forensics; Frequency; History; Pediatrics; Spatial databases; Support vector machines; Text analysis; Author identification; compression models
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location
Barcelona
ISSN
1520-5363
Print_ISBN
978-1-4244-4500-4
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2009.208
Filename
5277555