• DocumentCode
    2417889
  • Title

    Automatic Information Extraction from E-Commerce Web Sites

  • Author

    Qiu, Taofen ; Yang, Tianqi

  • Author_Institution
    Dept. of Comput., Jinan Univ., Guangzhou, China
  • fYear
    2010
  • fDate
    7-9 May 2010
  • Firstpage
    1399
  • Lastpage
    1402
  • Abstract
    With the rapid development of e-commerce, online transactions have become an important part of people's lives. To support the smooth development of e-commerce activities, providing users with efficient and practical product information has become an urgent and critical problem. This paper presents a set of novel techniques based on a page similarity measure, page clustering, and wrapper generation to automatically extract data from E-Commerce web sites. Experiments on real web sources show the effectiveness of the proposed techniques.
  • Keywords
    Web sites; electronic commerce; information filtering; pattern clustering; automatic information extraction; e-commerce Web sites; online transactions; page clustering; page similarity measure; wrapper generation; Business; Data mining; Feature extraction; HTML; Web pages; XML; E-Commerce; information extraction; page clustering; template generation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    E-Business and E-Government (ICEE), 2010 International Conference on
  • Conference_Location
    Guangzhou
  • Print_ISBN
    978-0-7695-3997-3
  • Type

    conf

  • DOI
    10.1109/ICEE.2010.355
  • Filename
    5591749