Title :
Data mining and knowledge discovery in databases: implications for scientific databases
Author_Institution :
Microsoft Corp., Redmond, WA, USA
Abstract :
Data mining and knowledge discovery in databases (KDD) promise to play an important role in the way people interact with databases, especially scientific databases, where analysis and exploration operations are essential. The author defines the basic notions of data mining and KDD, states their goals, presents the motivation, and gives a high-level definition of the KDD process and how it relates to data mining. The author then focuses on data mining methods, providing basic coverage of a sampling of methods to illustrate them and how they are used. The author covers a case study of a successful application in science data analysis: the classification and cataloging of a major astronomy sky survey covering 2 billion objects in the northern sky. The system can outperform humans as well as classical computational analysis tools in astronomy on the task of recognizing faint stars and galaxies. The author also covers the problem of scaling a clustering problem to a large catalog database of billions of objects.
Keywords :
astronomy computing; cataloguing; classification; data analysis; scientific information systems; very large databases; analysis operations; astronomy sky survey; cataloguing classification; clustering problem scaling; data mining; databases; exploration operations; faint galaxy recognition; faint star recognition; knowledge discovery; large catalog database; northern sky; science data analysis; scientific databases; Aggregates; Astronomy; Data analysis; Data mining; Data visualization; Humans; Pattern recognition; Sampling methods; Statistics; Visual databases;
Conference_Title :
Scientific and Statistical Database Management, 1997. Proceedings., Ninth International Conference on
Conference_Location :
Olympia, WA
Print_ISBN :
0-8186-7952-2
DOI :
10.1109/SSDM.1997.621141