Title :
Mining Semantic Time Period Similarity in Spatio-Temporal Climate Data
Author :
McGuire, Michael P. ; Ziying Tang
Author_Institution :
Comput. & Inf. Sci. Dept., Towson Univ., Towson, MD, USA
Abstract :
Over the last decade, advances in high performance computing and remote sensing have produced a vast amount of spatio-temporal data. One area where this data explosion is most prevalent is climate science. With this in mind, there is an increasing need to characterize large spatio-temporal datasets. One such characterization is to find periods of time that exhibit the same spatio-temporal pattern. The focus of this research is to find similar spatio-temporal patterns for semantic time periods. A semantic time period could be any arbitrary division in time such as year, month, or week. The proposed approach first characterizes the data spatially, using one of three approaches (local entropy, local spatial autocorrelation, or local distance-based outliers) to identify interesting spatial features in the dataset. Then, a location/time period matrix, which is analogous to a term/document matrix in natural language processing, is created to capture the spatial features for a given semantic time period. For each spatial location, this matrix records the number of times that the location is a feature of interest during each semantic time period. Latent semantic analysis is then used to calculate the cosine similarity between semantic time periods. The resulting similarities are clustered using affinity propagation. The results show that the similarity matrix produced by distance-based outliers yields the best clustering. The approach is demonstrated on a modeled global climate dataset in which years from 1948 to 2012 are clustered.
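The matrix and similarity steps described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `period_similarity`, the boolean `flags` input (standing in for the output of any of the three spatial characterizations), the `rank` parameter, and the toy data are all assumptions made for this example. Latent semantic analysis is approximated here by a plain truncated SVD in NumPy.

```python
import numpy as np

def period_similarity(flags, period_ids, rank=2):
    """Cosine similarity between semantic time periods in a low-rank latent space.

    flags      : (n_timesteps, n_locations) array, nonzero where a location was
                 flagged as a spatial feature of interest (hypothetical input).
    period_ids : length n_timesteps, the semantic period label of each timestep.
    rank       : number of singular vectors to keep (assumed parameter).
    """
    flags = np.asarray(flags, dtype=float)
    period_ids = np.asarray(period_ids)
    periods = np.unique(period_ids)

    # Location/time period matrix: rows are locations, columns are periods,
    # entries count how often a location was flagged within that period.
    counts = np.stack(
        [flags[period_ids == p].sum(axis=0) for p in periods], axis=1
    )

    # Truncated SVD, the core of latent semantic analysis.
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    k = min(rank, len(s))
    coords = vt[:k].T * s[:k]  # one row of latent coordinates per period

    # Cosine similarity between the period vectors (all-zero rows stay zero).
    norms = np.linalg.norm(coords, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = coords / norms
    return periods, unit @ unit.T

# Toy data: 6 timesteps, 4 locations, 3 yearly periods (made up for illustration).
flags = [
    [1, 1, 0, 0], [1, 0, 0, 0],   # 2000
    [1, 1, 0, 0], [1, 0, 0, 0],   # 2001, same pattern as 2000
    [0, 0, 1, 1], [0, 0, 0, 1],   # 2002, a disjoint pattern
]
period_ids = [2000, 2000, 2001, 2001, 2002, 2002]
periods, sim = period_similarity(flags, period_ids, rank=2)
```

In this toy case the two years with identical flag counts come out with similarity close to 1, and the year with a disjoint pattern comes out close to 0. A similarity matrix of this kind could then be passed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`, to obtain the final clusters.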
Keywords :
climatology; data analysis; data mining; geophysics computing; matrix algebra; natural language processing; pattern clustering; affinity propagation; cosine similarity; data explosion; document matrix; global climate dataset; high performance computing; large datasets; latent semantic analysis; local distance-based outliers; local entropy; local spatial autocorrelation; location period matrix; remote sensing; semantic time period similarity mining; spatiotemporal climate data; term matrix; time period matrix; Correlation; Data mining; Entropy; Meteorology; Semantics; Temperature measurement; Tensile stress; climate data; semantic time period; similarity; spatio-temporal data mining
Conference_Title :
2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW)
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4799-3143-9
DOI :
10.1109/ICDMW.2013.94