  • DocumentCode
    1679492
  • Title
    Authorship attribution using function words adjacency networks
  • Author
    Segarra, Santiago; Eisen, Mark; Ribeiro, Alejandro
  • Author_Institution
    Dept. of Electr. & Syst. Eng., Univ. of Pennsylvania, Philadelphia, PA, USA
  • fYear
    2013
  • Firstpage
    5563
  • Lastpage
    5567
  • Abstract
    We present an authorship attribution method based on relational data between function words, the content-independent words that help define grammatical relationships. As relational structures we use normalized word adjacency networks. We interpret these networks as Markov chains and compare them using entropy measures. We illustrate the accuracy of the method through a series of numerical experiments, including comparisons with frequency-based methods. We show that accuracy increases when relational and frequency-based data are combined, indicating that the two sources of information encode different aspects of authorial styles.
  • Keywords
    Markov processes; entropy; relational databases; text analysis; word processing; Markov chains; authorial styles; authorship attribution method; content independent words; entropy measures; frequency-based data; function words; grammatical relationships; information sources; normalized word adjacency networks; relational data; relational structures; Accuracy; Frequency measurement; Pragmatics; Support vector machines; Wide area networks; Authorship attribution; Markov chain; relative entropy; word adjacency network
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6638728
  • Filename
    6638728