Title :
Contextualisation of Geographical Scraped Data to Support Human Judgment and Classification
Author :
Mazzola, Luca ; Tsois, Aris ; Dimitrova, T. ; Camossi, Elena
Author_Institution :
Joint Res. Center (JRC), Eur. Comm., Varese, Italy
Abstract :
When dealing with information extraction or data mining for security, one of the prerequisites is the data cleaning process, which deeply influences the final result. This is particularly true for data scraped automatically from online sources (web pages) that contain geographical or geo-referenced information. In this paper we present a model, and a first partial implementation, for resolving string descriptions to locations. The domain is the monitoring and analysis of maritime container traffic, relying on the status messages generated by container carriers. The model is based on three data dimensions: string similarity, trajectory similarity, and most frequent patterns. The implemented interface integrates the three dimensions through a map-based view. This functionality supports human experts in associating a location with the string description provided in the raw record, in order to increase the number of messages usable for route-based analysis.
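As an illustration of the string-similarity dimension only, the following is a minimal sketch and not the authors' implementation. It assumes a small hypothetical gazetteer of known port names and uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure; the function name `rank_candidates` and the sample strings are invented for this example. The ranked list is meant as a suggestion for a human expert to review, not as an automatic decision.

```python
from difflib import SequenceMatcher


def rank_candidates(description, gazetteer, top_n=3):
    """Rank known location names by similarity to a raw string description.

    `gazetteer` is a hypothetical list of known location names; the
    similarity measure (difflib ratio) is an assumption for illustration.
    """
    query = description.strip().lower()
    scored = [
        (SequenceMatcher(None, query, name.lower()).ratio(), name)
        for name in gazetteer
    ]
    # Highest similarity first; the expert inspects the top candidates.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical gazetteer and a misspelled description from a raw record.
gazetteer = ["Rotterdam", "Antwerp", "Hamburg", "Genoa"]
ranked = rank_candidates("ROTERDAM PORT", gazetteer)
for score, name in ranked:
    print(f"{name}: {score:.2f}")
```

In the model described above, such a ranking would be only one of three signals; trajectory similarity and most frequent patterns would be combined with it before presenting candidates on the map.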
Keywords :
data mining; geographic information systems; pattern classification; public administration; sea ports; security of data; classification; container carriers; data cleaning process; geographical information; geographical scraped data contextualisation; georeferenced information; human judgment; information extraction; location resolution; map-based view; maritime container traffic; online sources; route-based analysis; security; status messages; string descriptions; string similarity; trajectories similarity; Containers; Estimation; Europe; Geospatial analysis; Lenses; Semantics; Trajectory;
Conference_Title :
Intelligence and Security Informatics Conference (EISIC), 2013 European
Conference_Location :
Uppsala
DOI :
10.1109/EISIC.2013.33