Title :
Training set selection using entropy based distance
Author :
Kajdanowicz, Tomasz ; Plamowski, Slawomir ; Kazienko, Przemyslaw
Author_Institution :
Inst. of Informatics, Wroclaw Univ. of Technol., Wroclaw, Poland
Abstract :
Distance measures, especially those between probability density functions, are essential in solving machine learning problems. Besides classification and clustering, such problems include data reduction and selection. This paper describes a new distance measure for comparing and selecting training datasets. The distance between two datasets is based on the variance of entropy in groups obtained by clustering the joint datasets. The proposed approach is examined in dataset selection for the prediction of debt portfolio value. Finally, a basic evaluation of prediction performance is conducted.
Keywords :
data reduction; entropy; investment; learning (artificial intelligence); pattern classification; pattern clustering; probability; data selection; debt portfolio value; distance measures; entropy based distance; joint dataset clustering; machine learning problems; probability density functions; training set selection; Entropy; Portfolios; Prediction algorithms; Probability density function; Testing; Training; Vectors; dataset selection; debt valuation; intelligent systems; prediction methods; supervised learning;
Conference_Title :
Applied Electrical Engineering and Computing Technologies (AEECT), 2011 IEEE Jordan Conference on
Conference_Location :
Amman
Print_ISBN :
978-1-4577-1083-4
DOI :
10.1109/AEECT.2011.6132530