DocumentCode :
3310617
Title :
2D TSA-tree: a wavelet-based approach to improve the efficiency of multi-level spatial data mining
Author :
Shahabi, Cyrus ; Chung, Seokkyung ; Safar, Maytham ; Hajj, George
Author_Institution :
Dept. of Comput. Sci., Univ. of Southern California, Los Angeles, CA, USA
fYear :
2001
fDate :
2001
Firstpage :
59
Lastpage :
68
Abstract :
Due to the large amount of collected scientific data, it is becoming increasingly difficult for scientists to comprehend and interpret the available data. Moreover, typical queries on these data sets aim to identify (or visualize) trends and surprises in a selected sub-region at multiple levels of abstraction, rather than to retrieve information about a specific data point. The authors propose a versatile wavelet-based data structure, the 2D TSA-tree (Trend and Surprise Abstractions Tree), to enable efficient multi-level trend detection on spatial data. We show how the 2D TSA-tree can be utilized efficiently for sub-region selections. Moreover, the 2D TSA-tree can be used to precompute the reconstruction error and retrieval time of a data subset, allowing the user to trade off accuracy for response time (or vice versa) at query time. Finally, when storage space is limited, our 2D Optimal TSA-tree saves storage by keeping only a specific optimal subset of the tree. To demonstrate the effectiveness of the proposed methods, we evaluated the 2D TSA-tree using real and synthetic data. Our results show that our method outperformed other methods (DFT and SVD) in terms of accuracy, complexity and scalability.
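The idea of splitting spatial data into a coarse "trend" and fine "surprise" components at several levels can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an averaging Haar wavelet, a square grid whose side is a power of two, and hypothetical function names (`haar_split_2d`, `build_levels`) chosen only for illustration. The abstract does not say which wavelet filter the 2D TSA-tree uses.

```python
def haar_split_2d(grid):
    """Split a 2D grid into one trend grid and three surprise (detail) grids.

    Assumption: an averaging Haar step over non-overlapping 2x2 blocks.
    The grid must have an even number of rows and columns.
    """
    trend, horiz, vert, diag = [], [], [], []
    for i in range(0, len(grid), 2):
        t_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, len(grid[0]), 2):
            a, b = grid[i][j], grid[i][j + 1]
            c, d = grid[i + 1][j], grid[i + 1][j + 1]
            t_row.append((a + b + c + d) / 4)  # block average (trend)
            h_row.append((a - b + c - d) / 4)  # left-vs-right difference
            v_row.append((a + b - c - d) / 4)  # top-vs-bottom difference
            d_row.append((a - b - c + d) / 4)  # diagonal difference
        trend.append(t_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return trend, (horiz, vert, diag)


def build_levels(grid, levels):
    """Repeatedly split the trend, keeping (trend, surprises) per level."""
    result = []
    current = grid
    for _ in range(levels):
        current, surprises = haar_split_2d(current)
        result.append((current, surprises))
    return result


def reconstruct(trend, surprises):
    """Invert one haar_split_2d step exactly."""
    horiz, vert, diag = surprises
    size_r, size_c = len(trend) * 2, len(trend[0]) * 2
    out = [[0.0] * size_c for _ in range(size_r)]
    for i in range(len(trend)):
        for j in range(len(trend[0])):
            t, h, v, d = trend[i][j], horiz[i][j], vert[i][j], diag[i][j]
            out[2 * i][2 * j] = t + h + v + d
            out[2 * i][2 * j + 1] = t - h + v - d
            out[2 * i + 1][2 * j] = t + h - v - d
            out[2 * i + 1][2 * j + 1] = t - h - v + d
    return out


grid = [[float(4 * r + c) for c in range(4)] for r in range(4)]
levels = build_levels(grid, 2)
```

In this sketch each level stores a coarser trend grid plus the three surprise grids needed to rebuild the finer level; dropping surprise grids saves storage at the cost of reconstruction error, which is the kind of accuracy-versus-cost trade-off the abstract describes.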
Keywords :
data mining; query processing; scientific information systems; tree data structures; visual databases; wavelet transforms; 2D Optimal TSA-tree; 2D TSA-tree; DFT; SVD; Trend and Surprise Abstractions Tree; data point; data sets; data subset; multi-level spatial data mining; multi-level trend detection; multiple levels of abstraction; optimal subset; query time; reconstruction error; response time; retrieval time; scientific data; spatial data; storage space; sub-region; sub-region selections; synthetic data; wavelet-based approach; wavelet-based data structure; Computer science; Data mining; Data visualization; Delay; Earth; Global Positioning System; Information retrieval; Laboratories; Ocean temperature; Propulsion;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Scientific and Statistical Database Management, 2001. SSDBM 2001. Proceedings. Thirteenth International Conference on
Conference_Location :
Fairfax, VA
ISSN :
1099-3371
Print_ISBN :
0-7695-1218-6
Type :
conf
DOI :
10.1109/SSDM.2001.938538
Filename :
938538