Title of article :
Examination and comparison of conflicting data in granulated datasets: Equal width interval vs. equal frequency interval
Author/Authors :
ChienHsing Wu, Shu-Chen Kao, Koji Okuhara
Issue Information :
Journal with serial number, year 2013
Pages :
11
From page :
154
To page :
164
Abstract :
Knowledge discovery from databases requires comprehensive pre-examination to ensure that granulated datasets are consistent for continuous database conversion. Different granulation techniques may produce different results in the number of conflicting data in a granulated dataset. This work examines and compares the performance of equal width interval (EWI) and equal frequency interval (EFI), two granulation techniques. This work also explores the relationship between granulation performance and dataset size, number of attributes, and number of classes. Eighteen continuous datasets are examined. Experimental results indicate that (1) of the 18 datasets examined, 7 contained conflicting data by EWI and 8 by EFI, suggesting that almost 40% of the granulated datasets contained conflicting data; (2) almost 22% of the datasets had more than 20% conflicting data; (3) comparatively, no notable difference existed between EWI and EFI with respect to their granulation performance; (4) the production of conflicting data by EWI and EFI when compared against dataset size and number of classes was not remarkably different; and (5) more than 12 attributes will reduce the number of conflicting data by both EWI and EFI.
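As a rough illustration of the two granulation techniques compared in the abstract, the sketch below bins continuous attributes by equal width and by equal frequency, then counts conflicting records. It is not the authors' implementation. The function names, the number of intervals `k`, the toy data, and the working definition of a conflict (records whose granulated attribute values are identical but whose class labels differ) are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def equal_width_bins(values, k):
    """EWI: split the range [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)
    # The maximum value would fall into bin k, so clamp it to k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """EFI: choose cut points so each interval holds roughly n / k records."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[j * n // k] for j in range(1, k)]
    # Equal values always land in the same interval.
    return [bisect_right(cuts, v) for v in values]


def granulate(columns, binning, k):
    """Apply one binning function to every attribute column; return rows."""
    return list(zip(*(binning(col, k) for col in columns)))


def count_conflicting(rows, labels):
    """Count records whose granulated row occurs with more than one class."""
    classes = defaultdict(set)
    for row, label in zip(rows, labels):
        classes[row].add(label)
    return sum(1 for row in rows if len(classes[row]) > 1)


# Hypothetical toy data: one skewed attribute and two classes.
columns = [[1.0, 2.0, 3.0, 4.0, 100.0]]
labels = ["a", "b", "a", "b", "a"]

for name, binning in [("EWI", equal_width_bins), ("EFI", equal_frequency_bins)]:
    rows = granulate(columns, binning, k=2)
    print(name, rows, count_conflicting(rows, labels))
```

With more attributes, each granulated row becomes a longer tuple, so two records are less likely to coincide. That is consistent with the abstract's observation that datasets with more than 12 attributes produced fewer conflicts.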
Keywords :
Granulation, Conflicting data, Data mining
Journal title :
Information Sciences
Serial Year :
2013
Record number :
1215681