DocumentCode :
553216
Title :
Automatically locating salutation and signature blocks in emails
Author :
Meijuan Yin ; Junyong Luo ; Ding Cao ; Xiaonan Liu ; Mingtao Li
Author_Institution :
Zhengzhou Inf. Sci. & Technol. Inst., Zhengzhou, China
Volume :
3
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
1783
Lastpage :
1787
Abstract :
This paper focuses on the problem of automatically locating salutation and signature blocks in the body of plain-text emails. The text of the salutation and signature blocks in an email usually contains identity information about the email's sender or recipients. Locating and extracting salutation and signature blocks from emails has many potential applications, such as entity attribute extraction, person-entity-based email social network analysis, anonymization of email corpora, and improving automatic content-based email classifiers and email threading. Our approach combines a statistical method with a rules restricted method, which can greatly improve locating efficiency while maintaining relatively high accuracy of the extracted blocks. We use the statistical method to roughly estimate the number of lines in the salutation and signature blocks, and introduce restriction rules to refine the lines located by the statistical method. Results on the public subset of the Enron corpus show the high performance of our approach, with an average F1 value above 94%.
Keywords :
data mining; e-mail filters; statistical analysis; Enron corpus; automatic content-based email classifiers; automatic salutation location; automatical signature block location; email corpora anonymization; email threading; entity attribute extraction; plain-text emails; rules restricted method; social network analysis; statistical method; Accuracy; Algorithm design and analysis; Electronic mail; Feature extraction; Learning systems; Social network services; Statistical analysis; email body analysis; locating; salutation blocks; signature blocks;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6019891
Filename :
6019891