DocumentCode :
1871071
Title :
Research and implementation of the technology supporting MicroBlog data collection based on web crawler
Author :
Yuan Xiaohong ; Zhou Sisi
Author_Institution :
College of Computer Science and Information Technology, Central South University of Forestry and Technology, Hunan, Changsha 410004, China
fYear :
2012
fDate :
3-5 March 2012
Firstpage :
1674
Lastpage :
1677
Abstract :
MicroBlog is an effective vehicle for online public opinion and plays an important role in its dissemination. This paper designs a crawler for MicroBlog that consists of two parts: user crawling and content crawling. The crawler uses a protocol-driven strategy, an event-driven strategy, and template-based extraction to extract and store the data. Experiments show that the crawler collects information more efficiently and more completely than a BFS crawler. As DOM trees become more complex, a more flexible crawler will be needed.
Keywords :
AJAX; MicroBlog; crawler; web information extraction
fLanguage :
English
Publisher :
IET
Conference_Titel :
Automatic Control and Artificial Intelligence (ACAI 2012), International Conference on
Conference_Location :
Xiamen
Electronic_ISBN :
978-1-84919-537-9
Type :
conf
DOI :
10.1049/cp.2012.1307
Filename :
6492914