Authorship attribution of text samples using neural networks and Bayesian classifiers

Author

Kjell, Bradley

Author_Institution

Dept. of Comput. Sci., Central Connecticut State Univ., New Britain, CT, USA

Volume

2

fYear

1994

fDate

2-5 Oct 1994

Firstpage

1660

Abstract

Previous work has shown that statistics of letter pairs extracted from text samples can be effective in discriminating between two authors writing in a similar style. This paper extends that work by using n-tuples for n from 1 to 5. The features used in classification are the relative frequencies of the tuples, transformed with a KL transform. Both three-layer neural network classifiers and Bayesian classifiers are used with these features to classify text samples from two similar authors. The most effective combination was 2-tuples used with a neural network classifier, although other combinations did nearly as well.
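
The feature pipeline described in the abstract (relative frequencies of letter n-tuples, followed by a KL transform) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `tuple_frequencies` and `kl_transform`, the choice to drop non-letter characters, the tuple vocabulary, and the number of retained components `k` are all assumptions, since the abstract does not specify them. The KL transform is taken here in its usual sense of projection onto the leading eigenvectors of the sample covariance matrix.

```python
from collections import Counter

import numpy as np


def tuple_frequencies(text, n, vocabulary):
    """Relative frequency of each letter n-tuple in `vocabulary` within `text`.

    Assumption: case is folded and non-letter characters are discarded.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    tuples = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    counts = Counter(tuples)
    total = max(len(tuples), 1)  # avoid division by zero on very short text
    return np.array([counts[t] / total for t in vocabulary])


def kl_transform(features, k):
    """Project the rows of `features` onto the top-k covariance eigenvectors.

    `features` has one row per text sample and at least two columns.
    Returns the projected data, the feature mean, and the basis used.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    basis = eigvecs[:, order]
    return centered @ basis, mean, basis
```

In such a setup, each text sample would yield one row of tuple frequencies, and the reduced rows returned by `kl_transform` would then be passed to whichever classifier is being compared (a neural network or a Bayesian classifier in the paper).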

Keywords

Bayes methods; document handling; feature extraction; feedforward neural nets; pattern classification; statistical analysis; Bayesian classifiers; KL transform; authorship attribution; classification; multilayer neural network classifiers; text samples; tuples; writing style; Bayesian methods; Computer science; Concatenated codes; Displays; Frequency; Karhunen-Loeve transforms; Neural networks; Statistics; Testing; Writing

fLanguage

English

Publisher

IEEE

Conference_Titel

Systems, Man, and Cybernetics, 1994. Humans, Information and Technology., 1994 IEEE International Conference on

Conference_Location

San Antonio, TX

Print_ISBN

0-7803-2129-4

Type

conf

DOI

10.1109/ICSMC.1994.400086

Filename

400086