DocumentCode :
1985791
Title :
Web information extraction
Author :
Lam, Man I. ; Gong, Zhiguo
Author_Institution :
Fac. of Sci. & Technol., Univ. of Macau, Macao, China
fYear :
2005
fDate :
27 June-3 July 2005
Abstract :
With the continuous development of Internet technologies, Web pages provide a huge amount of information resources. This alters the traditional way of preserving and searching for information. Queries targeting Web pages have become numerous and increasingly important. Nowadays, search engines are a very popular way to search for information on the Web. However, a search engine only presents a list of documents rather than a specific answer or piece of knowledge for the user's specific question. Therefore, data extraction from the Web is becoming a hot topic. In this paper, we investigate the current development of Web data extraction, its difficulties, and its objectives. In addition, we illustrate and analyze some examples and provide our solution for information extraction from the Web.
Keywords :
Internet; information retrieval; search engines; Web information extraction; Web pages; information resources; Data mining; Data warehouses; HTML; Information analysis; Markup languages; XML
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Acquisition, 2005 IEEE International Conference on
Print_ISBN :
0-7803-9303-1
Type :
conf
DOI :
10.1109/ICIA.2005.1635157
Filename :
1635157