DocumentCode
2301361
Title
Automatically Detecting Personal Topics by Clustering Emails
Author
Yang, Huijie ; Luo, Junyong ; Yin, Meijuan ; Liu, Yan
Author_Institution
Inf. Sci. & Technol. Inst., Zhengzhou, China
Volume
3
fYear
2010
fDate
6-7 March 2010
Firstpage
91
Lastpage
94
Abstract
Emails play an important role in daily life. It has been recognized that clustering emails into meaningful groups can greatly reduce the cognitive load of processing them. Mailbox users are increasingly concerned with how to organize and manage their emails and how to mine meaningful data from them conveniently and effectively. This paper proposes a novel approach to detecting personal topics using a clustering algorithm. First, we preprocess the emails and construct an improved email VSM (vector space model) that represents each email by combining its body and subject in a new way. Next, we adopt an advanced k-means algorithm to cluster the emails and design a kernel-selected algorithm based on the lowest similarity. We then extract appropriate keywords to label the topic of each cluster. Finally, experiments on the 20 Newsgroups email dataset show the validity of our approach, and the experimental results closely match the human-labeled clustering result.
Keywords
electronic mail; pattern clustering; automatic personal topic detection; clustering emails; email management; email organization; k-means algorithm; kernel-selected algorithm; vector space model; Algorithm design and analysis; Clustering algorithms; Computer science; Computer science education; Data mining; Educational technology; Humans; Information science; Natural languages; Speech recognition; Email VSM; email clustering; kernel-selected; topic detection
fLanguage
English
Publisher
IEEE
Conference_Titel
Education Technology and Computer Science (ETCS), 2010 Second International Workshop on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-6388-6
Electronic_ISBN
978-1-4244-6389-3
Type
conf
DOI
10.1109/ETCS.2010.238
Filename
5459924