DocumentCode :
3036374
Title :
Knowledge discovery in databases: applications in the electrical power engineering domain
Author :
Steele, J.A. ; McDonald, J.R. ; Arcy, C.D.
Author_Institution :
Strathclyde Univ., Glasgow, UK
fYear :
1997
fDate :
3 Dec 1997
Firstpage :
8/1
Lastpage :
8/4
Abstract :
Knowledge discovery in databases (KDD) is defined as the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (W.J. Frawley et al., 1991). KDD is an iterative process involving five steps which lead to the final goal of useful information. The five steps are: selection of data (determining which fields and records are to be analysed); preprocessing (cleaning the data by removing noise and outliers, if appropriate, and deciding on strategies for missing attribute values); transformation (representing the data by new features and reducing its dimensionality); data mining (deciding which algorithms to apply to the data, e.g., classification, regression, rule induction, neural networks); and interpretation/evaluation (feasibility analysis of the results from the data mining step). There are two general 'goals' in KDD: verification of a hypothesis, and discovery, where the 'system' autonomously discovers patterns. Within the KDD process a data warehouse is typically employed as the 'source' of the KDD exercise. The power industry has evolved to become dependent upon computerised environments, with more online data being stored for later extraction and investigation. Two key areas where KDD has been shown to be applicable are the analysis of energy pooling and settlement data, and condition monitoring of power system plant.
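The five KDD steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the records, field names (`temp_c`, `load_mw`, `operator`), the mean-fill strategy for missing values, the z-score feature, and the 1.5 threshold are all assumptions chosen only to make each step concrete for a condition-monitoring flavoured example.

```python
from statistics import mean, pstdev

# Hypothetical plant-monitoring records; names and values are illustrative only.
RECORDS = [
    {"unit": "T1", "temp_c": 61.0, "load_mw": 40.0, "operator": "a"},
    {"unit": "T1", "temp_c": 63.0, "load_mw": 42.0, "operator": "b"},
    {"unit": "T1", "temp_c": None, "load_mw": 41.0, "operator": "a"},
    {"unit": "T1", "temp_c": 95.0, "load_mw": 43.0, "operator": "b"},
    {"unit": "T1", "temp_c": 62.0, "load_mw": 41.0, "operator": "a"},
]

def select(records, fields):
    """Selection: keep only the fields to be analysed."""
    return [{f: r[f] for f in fields} for r in records]

def preprocess(records, field):
    """Preprocessing: fill missing values with the mean of the observed ones
    (one possible missing-value strategy, assumed here)."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

def transform(records, field):
    """Transformation: represent the field by a new feature, its z-score."""
    values = [r[field] for r in records]
    mu, sigma = mean(values), pstdev(values)
    return [{**r, "z": (r[field] - mu) / sigma if sigma else 0.0} for r in records]

def mine(records, threshold=1.5):
    """Data mining: a trivial rule that flags records with a large |z|."""
    return [r for r in records if abs(r["z"]) > threshold]

def evaluate(flagged, total):
    """Interpretation/evaluation: summarise how much of the data was flagged."""
    return {"flagged": len(flagged), "fraction": len(flagged) / total}

def run_pipeline(records):
    """Chain the five steps once; in practice the process is iterated."""
    selected = select(records, ["unit", "temp_c", "load_mw"])
    cleaned = preprocess(selected, "temp_c")
    featured = transform(cleaned, "temp_c")
    flagged = mine(featured)
    return flagged, evaluate(flagged, len(records))

if __name__ == "__main__":
    flagged, summary = run_pipeline(RECORDS)
    print(flagged, summary)
```

In a real study each step would be revisited after evaluation, for example by changing the selected fields or the mining algorithm, which is what makes the process iterative.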
Keywords :
knowledge acquisition; KDD; computerised environments; condition monitoring; data mining; data warehouse; electrical power engineering domain; energy pooling; feasibility analysis; hypothesis verification; iterative process; knowledge discovery in databases; missing attribute values; online data storage; power industry; power system plant; preprocessing; rule induction; settlement data; understandable patterns;
fLanguage :
English
Publisher :
iet
Conference_Titel :
IT Strategies for Information Overload (Digest No: 1997/340), IEE Colloquium on
Conference_Location :
London
Type :
conf
DOI :
10.1049/ic:19971153
Filename :
659910