Title :
Automatic Information Extraction from E-Commerce Web Sites
Author :
Qiu, Taofen ; Yang, Tianqi
Author_Institution :
Dept. of Comput., Jinan Univ., Guangzhou, China
Abstract :
With the rapid development of e-commerce, online transactions have become an important part of people's lives. To support the smooth development of e-commerce activities, providing users with efficient and practical product information has become an urgent and critical problem. This paper presents a set of novel techniques based on page similarity measure, page clustering, and wrapper generation to automatically extract data from e-commerce Web sites. Experiments on real Web sources show the effectiveness of the proposed techniques.
Keywords :
Web sites; electronic commerce; information filtering; pattern clustering; automatic information extraction; e-commerce Web sites; online transactions; page clustering; page similarity measure; wrapper generation; Business; Data mining; Feature extraction; HTML; Web pages; XML; E-Commerce; information extraction; template generation
Conference_Title :
E-Business and E-Government (ICEE), 2010 International Conference on
Conference_Location :
Guangzhou
Print_ISBN :
978-0-7695-3997-3
DOI :
10.1109/ICEE.2010.355