Title :
Information Extraction of Forum Based on Regular Expression
Author :
Gang He ; Yingwei Zhang ; Xiaochun Wu
Author_Institution :
Beijing Key Lab. of Network Syst. Archit. & Convergence, Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
This paper surveys the general-purpose forum systems widely used by mainstream domestic forums and analyzes the characteristics that distinguish them. Based on these characteristics, we propose the concept of a system fingerprint, which can be used to accurately identify the forum system behind a site and to extract users' information efficiently. The approach contributes to the development of network information auditing. Experimental results show that it achieves high extraction accuracy, and it has practical application value.
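The abstract does not give the fingerprints themselves, so the following is only a minimal sketch of how regular-expression fingerprinting might work, not the authors' implementation. The system names (`forum_a`, `forum_b`), the markup they match, and the helper names `detect_system` and `extract_users` are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical fingerprints: one regex per forum system, matched against the
# raw HTML of a page. These patterns are illustrative assumptions, not the
# fingerprints used in the paper.
FINGERPRINTS = {
    "forum_a": re.compile(r'<meta\s+name="generator"\s+content="ForumA[^"]*"', re.I),
    "forum_b": re.compile(r'Powered by\s+<a[^>]*>ForumB</a>', re.I),
}

# Hypothetical per-system patterns that capture the user name of each post.
USER_PATTERNS = {
    "forum_a": re.compile(r'<a class="author"[^>]*>([^<]+)</a>'),
    "forum_b": re.compile(r'<span class="username">([^<]+)</span>'),
}


def detect_system(html):
    """Return the name of the first system whose fingerprint matches, else None."""
    for name, pattern in FINGERPRINTS.items():
        if pattern.search(html):
            return name
    return None


def extract_users(html):
    """Detect the forum system, then apply that system's user-name pattern."""
    system = detect_system(html)
    if system is None:
        return None, []
    return system, USER_PATTERNS[system].findall(html)


# Made-up sample page for the hypothetical "forum_a" system.
SAMPLE_PAGE = """<html><head><meta name="generator" content="ForumA 1.0"></head>
<body><a class="author" href="/u/1">alice</a>
<a class="author" href="/u/2">bob</a></body></html>"""

if __name__ == "__main__":
    print(extract_users(SAMPLE_PAGE))
```

Detection and extraction are kept separate here, so that supporting another forum system would only require adding one fingerprint and one extraction pattern.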
Keywords :
Web sites; information retrieval; domestic mainstream forum; forum information extraction; network information auditing; popular universal forum systems; regular expression; system fingerprint; Data mining; Digital video broadcasting; Feature extraction; Fingerprint recognition; Information retrieval; Internet; Lead; forum system fingerprint; information extraction; regular expression matching
Conference_Title :
Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2013 5th International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-0-7695-5011-4
DOI :
10.1109/IHMSC.2013.175