Title :
Data Extraction from Deep Web Pages
Author :
Yang, Jufeng ; Shi, Guangshun ; Zheng, Yan ; Wang, Qingren
Abstract :
In this paper, we propose a novel model to extract data from Deep Web pages. The model has four layers, among which the access schedule, extraction layer and data cleaner are based on the rules of structure, logic and application. In the experiment section, we apply the new model to three intelligent systems: scientific paper retrieval, electronic ticket ordering, and resume searching. The results show that the proposed method is robust and feasible.
Keywords :
Computational intelligence; Crawlers; Data mining; Data security; Internet; Logic; Machine intelligence; Robustness; Scheduling; Web pages
Conference_Title :
Computational Intelligence and Security, 2007 International Conference on
Conference_Location :
Harbin, China
Print_ISBN :
0-7695-3072-9
Electronic_ISBN :
978-0-7695-3072-7
DOI :
10.1109/CIS.2007.39