• DocumentCode
    2632671
  • Title

    Sequential Pattern Mining for Chinese E-mail Authorship Identification

  • Author

    Ma, Jianbin; Li, Ying; Teng, Guifa; Wang, Fang; Zhao, Yang

  • Author_Institution
    Sch. of Inf. Sci. & Technol., Agric. Univ. of Hebei, Baoding
  • fYear
    2008
  • fDate
    18-20 June 2008
  • Firstpage
    73
  • Lastpage
    73
  • Abstract
    With the rapid growth of computer technology and the popularization of the Internet, e-mail has become an economical and convenient form of communication. However, various types of crime and civil action involving e-mail documents have appeared, harming people's lives and social stability. The authorship of criminal e-mails therefore has to be identified automatically for the purposes of computer forensics. Appropriate feature extraction and selection methods are essential to solving this problem. Unlike English and other Indo-European languages, Chinese text has no natural delimiter between words, and word segmentation is a major problem in Chinese text processing. This paper therefore describes sequential pattern feature mining methods that do not require word segmentation. The support vector machine algorithm was adopted as the classification algorithm. Experiments on limited samples yielded satisfactory results, showing that the sequential pattern feature mining methods were effective.
  • Keywords
    computer crime; feature extraction; pattern classification; support vector machines; text analysis; unsolicited e-mail; word processing; Chinese e-mail authorship identification; Chinese text processing; computer forensic; feature extraction; feature selection; sequential pattern mining; support vector machine algorithm; Data mining; Electronic mail; Feature extraction; Forensics; Information science; Internet; Natural languages; Postal services; Sequences; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Innovative Computing Information and Control, 2008. ICICIC '08. 3rd International Conference on
  • Conference_Location
    Dalian, Liaoning
  • Print_ISBN
    978-0-7695-3161-8
  • Electronic_ISBN
    978-0-7695-3161-8
  • Type

    conf

  • DOI
    10.1109/ICICIC.2008.489
  • Filename
    4603262