DocumentCode :
1705243
Title :
Variable Length Character N-Gram Approach for Online Writeprint Identification
Author :
Sun, Jianwen ; Yang, Zongkai ; Wang, Pei ; Liu, Sanya
Author_Institution :
Nat. Eng. Res. Center for E-learning, Huazhong Normal Univ., Wuhan, China
fYear :
2010
Firstpage :
486
Lastpage :
490
Abstract :
The Internet's numerous benefits have always been coupled with shortcomings due to abuses of online anonymity. Writeprint identification is a technique for identifying individuals based on the textual identity cues they leave behind in online messages. According to previous research, character n-grams are one of the most effective approaches to writeprint identification. In this study, we propose a writeprint identification framework based on variable-length character n-grams to address the identity tracing problem, integrating a genetic algorithm (GA) based feature selection component to address the problem of choosing n. To examine the approach, experiments are conducted on a test bed encompassing hundreds of reviews posted by 20 Amazon customers. The experimental results show that the proposed approach is effective, obtaining a considerable improvement in identification accuracy and a large reduction in feature dimensionality.
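As a rough illustration of the idea summarized in the abstract, the sketch below counts character n-grams over a range of lengths and then keeps a subset of them using a binary mask, which is the kind of chromosome a GA-based feature selector could evolve. The function names, the n range, and the mask are hypothetical choices made for this example; the abstract does not specify the paper's feature set, GA encoding, or classifier, and no GA search is performed here.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max (inclusive) in text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the text.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def apply_mask(counts, vocabulary, mask):
    """Return counts for the n-grams whose mask bit is 1.

    The mask stands in for a GA chromosome (assumed encoding):
    one bit per candidate n-gram in the vocabulary.
    """
    return [counts[gram] for gram, bit in zip(vocabulary, mask) if bit]


# Toy example: 1-grams and 2-grams of a short string.
counts = char_ngrams("abab", n_min=1, n_max=2)
vocabulary = sorted(counts)   # ['a', 'ab', 'b', 'ba']
mask = [1, 0, 0, 1]           # hypothetical chromosome keeping 'a' and 'ba'
selected = apply_mask(counts, vocabulary, mask)
```

In a full pipeline of this kind, a GA would typically score each candidate mask by the classification accuracy obtained with the selected features, so that n-grams of different lengths compete within one feature pool instead of n being fixed in advance.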
Keywords :
Internet; genetic algorithms; security of data; text analysis; Amazon customers; feature selection component; genetic algorithm; identification accuracy; identity tracing problem; online anonymity; online messages; online writeprint identification; textual identity cues; variable length character n-gram approach; Accuracy; Biological cells; Feature extraction; Gallium; Radio frequency; Training; character n-gram; feature selection; writeprint identification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia Information Networking and Security (MINES), 2010 International Conference on
Conference_Location :
Nanjing, Jiangsu
Print_ISBN :
978-1-4244-8626-7
Electronic_ISBN :
978-0-7695-4258-4
Type :
conf
DOI :
10.1109/MINES.2010.109
Filename :
5671087