Title :
Extracting spatial knowledge from the web
Author :
Morimoto, Yasuhiko ; Aono, Masaki ; Houle, Michael E. ; McCurley, Kevin S.
Author_Institution :
Tokyo Res. Lab., IBM Japan Ltd., Japan
Abstract :
The content of the World-Wide Web is pervaded by information of a geographical or spatial nature, particularly location information such as addresses, postal codes, and telephone numbers. We present a system for extracting spatial knowledge from collections of Web pages gathered by Web-crawling programs. For each page determined to contain location information, we apply geocoding techniques to compute geographic coordinates, such as latitude-longitude pairs. Next, we augment the location information with keyword descriptors extracted from the Web page contents. We then apply spatial data mining techniques to the augmented location information to derive spatial knowledge.
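The pipeline described in the abstract (detect location information, geocode it, attach keyword descriptors, then mine the result) can be illustrated with a minimal Python sketch. This is not the authors' system: the gazetteer, the postal-code pattern, the stopword list, the function names, and the grid-cell aggregation standing in for "spatial data mining" are all assumptions made for illustration, and the postal codes and coordinates are made-up placeholder values.

```python
import re
from collections import Counter, defaultdict

# Hypothetical gazetteer mapping postal codes to (latitude, longitude).
# Codes and coordinates are made-up placeholders, not real data.
GAZETTEER = {
    "000-0001": (35.0, 139.0),
    "000-0002": (34.0, 135.0),
}

# Assumed postal-code shape (three digits, hyphen, four digits).
POSTAL_RE = re.compile(r"\b\d{3}-\d{4}\b")
WORD_RE = re.compile(r"[a-z]+")
STOPWORDS = {"the", "a", "an", "of", "and", "in", "at", "our", "is", "to"}


def geocode(page_text):
    """Return coordinates for every known postal code found in the page."""
    codes = POSTAL_RE.findall(page_text)
    return [GAZETTEER[code] for code in codes if code in GAZETTEER]


def keywords(page_text, top_n=3):
    """Return the most frequent non-stopword terms as keyword descriptors."""
    words = [w for w in WORD_RE.findall(page_text.lower()) if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(top_n)]


def mine(pages, cell=0.5):
    """Count keyword descriptors per grid cell of `cell` degrees.

    A simple stand-in for spatial data mining: it shows which keywords
    are associated with which region.
    """
    cells = defaultdict(Counter)
    for text in pages:
        descriptors = keywords(text)
        for lat, lon in geocode(text):
            key = (int(lat // cell), int(lon // cell))
            cells[key].update(descriptors)
    return cells


pages = [
    "Ramen shop ramen noodles, 000-0001",
    "Ramen bar 000-0001",
    "Bookstore books 000-0002",
]
result = mine(pages)
```

Here `result` maps each grid cell to the keyword counts of the pages geocoded into it, so a region dominated by one kind of business would show up as a cell with a skewed keyword distribution. A real system would also need address and telephone-number parsing and a proper geocoding service, which this sketch omits.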
Keywords :
Web sites; data mining; information retrieval; search engines; Web pages; Web-crawling programs; World-Wide Web; addresses; geocoding techniques; geographic coordinates; latitude-longitude pairs; location information; postal codes; spatial data mining techniques; spatial knowledge extraction; telephone numbers; Companies; Internet; Laboratories; Marketing and sales; Spatial databases; Telephony
Conference_Title :
Applications and the Internet, 2003. Proceedings. 2003 Symposium on
Print_ISBN :
0-7695-1872-9
DOI :
10.1109/SAINT.2003.1183066