Title :
Extraction techniques for mining services from Web sources
Author :
Davulcu, Hasan ; Mukherjee, Saikat ; Ramakrishnan, I.V.
Author_Institution :
Dept. of Comput. Sci., State Univ. of New York, Stony Brook, NY, USA
Abstract :
The Web has established itself as the dominant medium for electronic commerce. Consequently, the number of service providers, both large and small, advertising their services on the Web continues to grow. In this paper, we describe new extraction algorithms for mining service directories from Web pages. We develop a novel propagation technique for identifying and accumulating all of the attributes related to a service entity in a Web page. We present experimental results on the effectiveness of our extraction techniques, obtained by mining a database of veterinarian service providers from Web sources.
Keywords :
data mining; electronic commerce; learning (artificial intelligence); extraction algorithms; mining service directories; web pages; web sites; Advertising; Cities and towns; Computer science; Databases; Electronic commerce; Ontologies; Taxonomy; Web pages
Conference_Title :
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-1754-4
DOI :
10.1109/ICDM.2002.1184008