  • DocumentCode
    3280286
  • Title
    Text Document Classification and Pattern Recognition
  • Author
    Wu, Qin; Fuller, Eddie; Zhang, Cun-Quan
  • Author_Institution
    Dept. of Math., West Virginia Univ., Morgantown, WV, USA
  • fYear
    2009
  • fDate
    20-22 July 2009
  • Firstpage
    405
  • Lastpage
    410
  • Abstract
    This extended abstract proposes a novel approach to text pattern recognition. Traditional models for text document classification rely mainly on keyword frequency; we instead introduce a graph-theoretic model built from both the frequency and the position of keywords. We apply this idea to two case studies: detecting fraudulent emails written by the same person, and detecting plagiarized publications. The results on these case studies show that the new method performs much better than traditional methods.
  • Keywords
    electronic mail; graph theory; pattern classification; text analysis; fraudulent email detection; graph theory model; keyword frequency; keyword position; plagiarized publications; text document classification; text pattern recognition; Clustering algorithms; Data mining; Frequency estimation; Internet; Mathematics; Pattern analysis; Pattern recognition; Social network services; Testing; Graph model; Similarity measures; Text Pattern
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    International Conference on Advances in Social Network Analysis and Mining (ASONAM '09), 2009
  • Conference_Location
    Athens
  • Print_ISBN
    978-0-7695-3689-7
  • Type
    conf
  • DOI
    10.1109/ASONAM.2009.21
  • Filename
    5231807