Title :
Mining and knowledge discovery from the Web
Author :
McCurley, Kevin S. ; Tomkins, Andrew
Author_Institution :
IBM Almaden Res. Center, San Jose, CA, USA
Abstract :
The World Wide Web presents an interesting opportunity for data mining and knowledge discovery, and this area is growing rapidly as both a research topic and a business activity. In this survey, we describe some of the problems being addressed and elements of the WebFountain infrastructure that we have built to address them. Our focus is on the lessons learned and the broad research areas involved.
Keywords :
Internet; data mining; Web data; Web pages; WebFountain infrastructure; World Wide Web; information extraction; knowledge discovery; Business; Databases; Government; Information retrieval; Search engines; Service oriented architecture; Tagging; Web sites
Conference_Title :
Parallel Architectures, Algorithms and Networks, 2004. Proceedings. 7th International Symposium on
Print_ISBN :
0-7695-2135-5
DOI :
10.1109/ISPAN.2004.1300449