DocumentCode
3280286
Title
Text Document Classification and Pattern Recognition
Author
Wu, Qin ; Fuller, Eddie ; Zhang, Cun-Quan
Author_Institution
Dept. of Math., West Virginia Univ., Morgantown, WV, USA
fYear
2009
fDate
20-22 July 2009
Firstpage
405
Lastpage
410
Abstract
In this extended abstract, we propose a novel approach to text pattern recognition. Traditional models for text document classification rely mainly on keyword frequency; we instead introduce a graph theory model built from both the frequency and the position of keywords. We applied this idea to two case studies: detecting fraudulent emails written by the same person, and detecting plagiarized publications. The results of these case studies show that the new method performs much better than traditional methods.
Keywords
electronic mail; graph theory; pattern classification; text analysis; fraudulent email detection; graph theory model; keyword frequency; keyword position; plagiarized publications; text document classification; text pattern recognition; Clustering algorithms; Data mining; Frequency estimation; Internet; Mathematics; Pattern analysis; Pattern recognition; Social network services; Testing; Graph model; Similarity measures; Text Pattern
fLanguage
English
Publisher
IEEE
Conference_Titel
International Conference on Advances in Social Network Analysis and Mining (ASONAM '09), 2009
Conference_Location
Athens
Print_ISBN
978-0-7695-3689-7
Type
conf
DOI
10.1109/ASONAM.2009.21
Filename
5231807