Title :
Text Document Classification and Pattern Recognition
Author :
Wu, Qin ; Fuller, Eddie ; Zhang, Cun-Quan
Author_Institution :
Dept. of Math., West Virginia Univ., Morgantown, WV, USA
Abstract :
In this extended abstract, a novel approach is proposed for text pattern recognition. Traditional models for text document classification are based mainly on keyword frequency; we instead introduce a graph-theoretic model constructed from both the frequency and the position of keywords. We applied this idea to two case studies: the detection of fraudulent emails written by the same person, and the detection of plagiarized publications. The results of these case studies show that the new method performs much better than traditional methods.
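The abstract does not specify how the graph is built, so the following is only an illustrative sketch of one way to combine keyword frequency and keyword position in a graph, not the authors' model. It assumes a keyword co-occurrence graph in which two keywords are linked each time they occur within a fixed token window, and it compares two documents with a weighted Jaccard score over edge weights. The names `keyword_graph` and `graph_similarity`, the `window` parameter, and the choice of similarity measure are all hypothetical.

```python
import re
from collections import Counter


def keyword_graph(text, keywords, window=3):
    """Build a weighted keyword graph for one document (illustrative assumption).

    Nodes are keywords. The weight of edge (a, b) counts how often keywords
    a and b occur at most `window` tokens apart, so the graph reflects both
    how often keywords appear and where they appear relative to each other.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    wanted = set(keywords)
    # (position, keyword) for every keyword occurrence, in document order
    hits = [(i, t) for i, t in enumerate(tokens) if t in wanted]
    edges = Counter()
    for idx, (i, a) in enumerate(hits):
        for j, b in hits[idx + 1:]:
            if j - i > window:
                break  # later hits are even farther away
            if a != b:
                edges[tuple(sorted((a, b)))] += 1
    return edges


def graph_similarity(g1, g2):
    """Weighted Jaccard similarity between two edge-weight Counters (assumption)."""
    keys = set(g1) | set(g2)
    if not keys:
        return 0.0
    shared = sum(min(g1[k], g2[k]) for k in keys)
    total = sum(max(g1[k], g2[k]) for k in keys)
    return shared / total


keywords = ["send", "money", "bank", "account"]
doc_a = "send money to the bank account today"
doc_b = "money account to the send bank today"  # same keyword counts, different order
g_a = keyword_graph(doc_a, keywords, window=2)
g_b = keyword_graph(doc_b, keywords, window=2)
score = graph_similarity(g_a, g_b)
```

In this toy example both documents have identical keyword frequencies, so a purely frequency-based comparison cannot tell them apart, while the position-aware graphs share no edges.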
Keywords :
electronic mail; graph theory; pattern classification; text analysis; fraudulent email detection; graph theory model; keyword frequency; keyword position; plagiarized publications; text document classification; text pattern recognition; Clustering algorithms; Data mining; Frequency estimation; Graph theory; Internet; Mathematics; Pattern analysis; Pattern recognition; Social network services; Testing; Data mining; Graph model; Pattern Recognition; Similarity measures; Text Pattern; Text analysis;
Conference_Title :
International Conference on Advances in Social Network Analysis and Mining, 2009 (ASONAM '09)
Conference_Location :
Athens
Print_ISBN :
978-0-7695-3689-7
DOI :
10.1109/ASONAM.2009.21