DocumentCode :
2369339
Title :
Localized prediction of continuous target variables using hierarchical clustering
Author :
Lazarevic, Aleksandar ; Kanapady, Ramdev ; Kamath, Chandrika ; Kumar, Vipin ; Tamma, Kumar
Author_Institution :
Dept. of Comput. Sci., Minnesota Univ., Minneapolis, MN, USA
fYear :
2003
fDate :
19-22 Nov. 2003
Firstpage :
139
Lastpage :
146
Abstract :
We propose a novel technique for the efficient prediction of multiple continuous target variables from high-dimensional and heterogeneous data sets using a hierarchical clustering approach. The proposed approach consists of three phases applied recursively: partitioning, localization and prediction. In the partitioning step, similar target variables are grouped together by a clustering algorithm. In the localization step, a classification model is used to predict which group of target variables is of particular interest. If the identified group of target variables still contains a large number of target variables, the partitioning and localization steps are repeated recursively and the identified group is further split into subgroups with more similar target variables. When the number of target variables per identified subgroup is sufficiently small, the third step predicts target variables using localized prediction models built from only those data records that correspond to the particular subgroup. Experiments performed on the problem of damage prediction in complex mechanical structures indicate that our proposed hierarchical approach is computationally more efficient and more accurate than straightforward methods of predicting each target variable individually or simultaneously using global prediction models.
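The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the clustering, classification, or regression methods, so the sketch substitutes simple stand-ins. Everything here is an assumption made for illustration: the names `build`, `predict`, and `active_target`, the rule that the target "of interest" is the one with the largest magnitude, the crude two-way split of targets by mean feature profile (partitioning), the nearest-centroid classifier (localization), and the 1-nearest-neighbour predictor on local records (prediction). The toy data is synthetic.

```python
from statistics import mean

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def centroid(rows):
    return [mean(col) for col in zip(*rows)]

def active_target(y):
    # Assumption: the target "of interest" is the one with the largest magnitude.
    return max(range(len(y)), key=lambda j: abs(y[j]))

def build(records, max_leaf=2):
    """records: list of (x, y) pairs. Returns a nested dict describing the hierarchy."""
    by_target = {}
    for x, y in records:
        by_target.setdefault(active_target(y), []).append(x)
    targets = sorted(by_target)
    if len(targets) <= max_leaf:
        return {"targets": targets, "records": records}  # leaf keeps local data only

    # Partitioning (stand-in): split targets in two by their mean feature profile.
    profile = {t: centroid(xs) for t, xs in by_target.items()}
    seed_a = profile[targets[0]]
    seed_b = max(profile.values(), key=lambda p: sq_dist(p, seed_a))
    group = {t: int(sq_dist(profile[t], seed_b) < sq_dist(profile[t], seed_a))
             for t in targets}
    if len(set(group.values())) < 2:
        return {"targets": targets, "records": records}  # cannot split further

    # Localization (stand-in): nearest-centroid classifier over the two groups.
    parts = {0: [], 1: []}
    for rec in records:
        parts[group[active_target(rec[1])]].append(rec)
    centroids = {g: centroid([x for x, _ in recs]) for g, recs in parts.items()}
    children = {g: build(recs, max_leaf) for g, recs in parts.items()}
    return {"centroids": centroids, "children": children}

def predict(node, x):
    # Descend the hierarchy, choosing the closest group centroid at each level.
    while "children" in node:
        g = min(node["centroids"], key=lambda k: sq_dist(x, node["centroids"][k]))
        node = node["children"][g]
    # Prediction (stand-in): 1-nearest-neighbour on the subgroup's own records.
    _, y = min(node["records"], key=lambda rec: sq_dist(x, rec[0]))
    return {t: y[t] for t in node["targets"]}

# Synthetic toy data: four targets, each active near its own feature-space centre.
centers = {0: (0.0, 0.0), 1: (0.0, 1.0), 2: (10.0, 0.0), 3: (10.0, 1.0)}
records = []
for t, (cx, cy) in centers.items():
    for dx in (-0.1, 0.0, 0.1):
        y = [0.0] * 4
        y[t] = 1.0 + dx
        records.append(([cx + dx, cy], y))

tree = build(records)
pred = predict(tree, [10.04, 1.0])  # values for the localized subgroup only
```

In this sketch a query is routed to one subgroup, and only that subgroup's targets are predicted from that subgroup's records, which is the source of the computational saving the abstract claims over building a global model for every target.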
Keywords :
distributed databases; learning (artificial intelligence); statistical analysis; very large databases; classification model; clustering algorithm; complex mechanical structures; continuous target variables; heterogeneous data sets; hierarchical clustering; localization step; localized prediction; partitioning step; target variables; Clustering algorithms; Computer science; Data analysis; Laboratories; Large-scale systems; Manufacturing processes; Mechanical engineering; Partitioning algorithms; Predictive models; Scientific computing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
Type :
conf
DOI :
10.1109/ICDM.2003.1250913
Filename :
1250913