DocumentCode :
774320
Title :
Sampling strategies for mining in data-scarce domains
Author :
Ramakrishnan, N.; Bailey-Kellogg, Chris
Author_Institution :
Virginia Tech, VA, USA
Volume :
4
Issue :
4
fYear :
2002
Firstpage :
31
Lastpage :
43
Abstract :
A novel framework leverages physical properties for mining in data-scarce domains. It interleaves bottom-up data mining with top-down data collection, leading to effective and explainable sampling strategies. This article describes focused sampling strategies for mining scientific data. Our approach is based on the Spatial Aggregation Language (SAL), which supports bottom-up construction of data interpretation and control design applications for spatially distributed physical systems. Used as a basis for describing data mining algorithms, SAL programs also help exploit knowledge of physical properties such as continuity and locality in data fields. We also introduce a top-down sampling strategy that focuses data collection on only those regions deemed most important to support a data mining objective.
Keywords :
data acquisition; data mining; eigenvalues and eigenfunctions; natural sciences computing; optimisation; sampling methods; data collection; data-scarce domains; eigenvalues; optimization; scientific data; spatial aggregation language; top-down sampling; Aerodynamics; Analytical models; Computational modeling; Data engineering; Design engineering; Distributed computing; Process design; Propulsion
fLanguage :
English
Journal_Title :
Computing in Science & Engineering
Publisher :
IEEE
ISSN :
1521-9615
Type :
jour
DOI :
10.1109/MCISE.2002.1014978
Filename :
1014978