Title :
Web content acquisition in web content aggregation service based on digital earth geospatial framework
Author :
Xu, Yunfei ; Weng, Jingnong ; Sharma, Ananta Raj ; Yussupov, Dilshod
Author_Institution :
Sch. of Comput. Sci. & Technol., Beihang Univ., Beijing, China
Abstract :
The Web Content Aggregation Service based on the Digital Earth geospatial framework combines an existing web content aggregation service with retrieval of the corresponding space position information. The customized service comprises three stages: web content acquisition, aggregation, and distribution. This paper focuses on the first stage, in which the required content is fetched from multiple web sites and location information is extracted from the fetched content. This stage supplies the location information used in the next two stages, and the aggregated result is displayed on the digital earth. The paper proposes a web content acquisition method based on the HTML DOM tree structure; from a practical point of view, the method is accurate and efficient, especially for web content containing spatial data.
Keywords :
Web services; Web sites; data acquisition; geographic information systems; hypermedia markup languages; information retrieval; tree data structures; HTML DOM tree structure; Web content acquisition; Web content aggregation service; Web sites; customized service; digital earth geospatial framework; distribution stage; location information; space position information retrieval; Browsers; Data mining; Earth; Engines; Feature extraction; Geospatial analysis; Internet; DOM tree; digital earth; web content aggregation; web scrapper;
Conference_Title :
Geoinformatics, 2011 19th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-849-5
DOI :
10.1109/GeoInformatics.2011.5980937