DocumentCode :
2253940
Title :
Data abstraction through density estimation by storage management
Author :
Meier, Kathrin Anne
Author_Institution :
Inst. of Sci. Comput., Swiss Federal Inst. of Technol., Zurich, Switzerland
fYear :
1997
fDate :
11-13 Aug 1997
Firstpage :
39
Lastpage :
50
Abstract :
One way to cope with the constantly growing amount of scientific data to be analyzed is to derive data abstractions from the original data. Data abstractions can provide a representation of the data in compressed form where the data's semantic structure is maintained. The author has explored data abstractions based on density estimation. The method to estimate the density of scientific data sets is based on the directory of a multidimensional data access structure. This data density estimator is called the directory estimator. It is based on multidimensional adaptive histograms and is therefore computationally efficient, even for large data sets and many dimensions. The paper describes the methodology in general and focuses on the estimator's accuracy in particular. The accuracy of the directory estimator depends on the parameters of the access structures used, such as the bucket capacity. She evaluates the choice of bucket capacity theoretically as well as empirically, with the ISE (integrated squared error) as the measure of error and a grid file as the data access structure. A useful application of the directory estimator in the field of scientific data is presented with a practical example from astronomy.
Keywords :
astronomy computing; data structures; natural sciences; scientific information systems; storage management; very large databases; access structures; astronomy; bucket capacity; compressed data; data abstraction; density estimation; directory estimator; grid file; integrated squared error; multidimensional adaptive histograms; multidimensional data access structure directory; scientific data sets; semantic structure; storage management; Astronomy; Data analysis; Data visualization; Extraterrestrial measurements; Histograms; Multidimensional systems; Scientific computing; Statistical analysis; Technology management;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Scientific and Statistical Database Management, 1997. Proceedings., Ninth International Conference on
Conference_Location :
Olympia, WA
Print_ISBN :
0-8186-7952-2
Type :
conf
DOI :
10.1109/SSDM.1997.621149
Filename :
621149