DocumentCode :
2465577
Title :
Client-side deep Web data extraction
Author :
Álvarez, Manuel ; Pan, Alberto ; Raposo, Juan ; Viña, Angel
Author_Institution :
Dept. of Inf. & Commun. Technol., Univ. of A Coruna
fYear :
2004
fDate :
15 Sept. 2004
Firstpage :
158
Lastpage :
161
Abstract :
The problem of data extraction from the deep Web can be divided into two tasks: crawling the client-side deep Web and crawling the server-side deep Web. The objective is to define an architecture and a set of related techniques to access the information placed in the client-side deep Web. This involves dealing with aspects such as JavaScript technology, nonstandard session maintenance mechanisms, client redirections, and pop-up menus. We use current browser APIs as building blocks and leverage them to implement novel crawling models and algorithms.
Keywords :
Internet; Java; application program interfaces; client-server systems; information retrieval; online front-ends; user interfaces; JavaScript technology; browser API; client redirections; client-side deep Web data extraction; nonstandard session maintenance mechanisms; pop-up menus; server-side deep Web; Communications technology; Crawlers; Data mining; Java; Navigation; Service oriented architecture; Uniform resource locators; Web page design; Web pages; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
E-Commerce Technology for Dynamic E-Business, 2004. IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7695-2206-8
Type :
conf
DOI :
10.1109/CEC-EAST.2004.30
Filename :
1388317