Title :
Detection of E-Commerce Systems with Sparse Features and Supervised Classification
Author :
Stoll, Kurt Uwe ; Hepp, Martin
Author_Institution :
E-Bus. & Web Sci. Res. Group, Univ. der Bundeswehr München, Neubiberg, Germany
Abstract :
Enriching web shop pages with structured data has recently become popular in e-commerce, a trend mainly driven by search engines favouring such pages. While structured data in e-commerce is mainly generated automatically by shop extensions, this data covers only a small share of the market, which is a major obstacle for applications operating on aggregated data. In this context, more than 90% of product detail pages on the web are generated by only 7 e-commerce systems. Meanwhile, little research addresses methods to automatically detect e-commerce systems. Automated detection would allow the design of system-specific extractors that increase the amount of structured data in e-commerce. Therefore, we propose a novel approach to this problem, which filters features generated from HTML tag attributes with an e-commerce-specific white list. We evaluate 6 classification algorithms on the problem and discuss computational effort. We show that this approach is capable of detecting the 6 most important e-commerce systems with an F1-score of 0.9 by analyzing only one HTML page per web shop. We evaluate our findings on an independent dataset and on reference shop sites.
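The feature-generation step named in the abstract (tokens taken from HTML tag attributes, then filtered with a white list) might look roughly like the following minimal sketch. It is not the authors' implementation: the white list contents, the tokenization rule, and the names `WHITELIST`, `AttributeTokenCollector`, and `sparse_features` are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical white list; the paper's actual e-commerce-specific list is not given here.
WHITELIST = {"cart", "product", "price", "checkout", "basket", "shop"}

# Assumed tokenization: runs of lower-case ASCII letters inside attribute values.
TOKEN_RE = re.compile(r"[a-z]+")


class AttributeTokenCollector(HTMLParser):
    """Collects lower-cased alphabetic tokens from all tag attribute values."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        for _name, value in attrs:
            if value:  # attributes without a value are reported as None
                self.tokens.extend(TOKEN_RE.findall(value.lower()))


def sparse_features(html, whitelist=WHITELIST):
    """Return a sparse {token: count} mapping restricted to white-listed tokens."""
    collector = AttributeTokenCollector()
    collector.feed(html)
    collector.close()
    features = {}
    for token in collector.tokens:
        if token in whitelist:
            features[token] = features.get(token, 0) + 1
    return features


page = '<div class="product-price" id="main"><a href="/checkout/cart">x</a></div>'
print(sparse_features(page))
```

A mapping like this, computed from a single page per shop, could then be vectorized and passed to any supervised classifier; which classifiers and which white list the paper actually uses is not stated in this record.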
Keywords :
Internet; classification; electronic commerce; hypermedia markup languages; retail data processing; search engines; F1-score; HTML page; HTML tag attributes; Web shop pages; aggregated data; automated detection; classification algorithms; e-commerce systems detection; shop extensions; shop sites; sparse features; structured data; supervised classification; system-specific extractors; Algorithm design and analysis; Business; HTML; Radio frequency; Support vector machines; Training; Web pages; e-commerce systems; supervised machine learning; web page classification;
Conference_Title :
e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference on
Conference_Location :
Coventry
DOI :
10.1109/ICEBE.2013.30