Title :
Accuracy of joint entropy and mutual information estimates
Author :
Bazsó, F. ; Zalányi, L. ; Petróczi, A.
Author_Institution :
KFKI Res. Inst. for Particle & Nucl. Phys., Hungarian Acad. of Sci., Budapest, Hungary
Abstract :
In practice, researchers can often collect only one, possibly large, data set and are forced to make inferences from a single sample. Based on the results of the polarisation operator technique of Bowman et al. (1969), we computed the dependence of joint entropy and mutual information estimates on the sample size in terms of asymptotic series. These expressions enabled us to control the bias of the estimates caused by finite sample sizes and to obtain an expression for their accuracy. The result is important in data mining, where joint entropy and mutual information are used to find interdependences within large data sets with unknown underlying structures.
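To make the finite-sample bias concrete, the sketch below computes plug-in (maximum-likelihood) estimates of joint entropy and mutual information from a single sample of discrete pairs, and applies the well-known first-order Miller-Madow correction, (K - 1) / (2n), where K is the number of occupied bins and n is the sample size. This is an illustration only: it is not the asymptotic series derived in the paper, and the function names (`plugin_entropy`, `miller_madow_entropy`, `entropy_estimates`) are hypothetical.

```python
import math
from collections import Counter


def plugin_entropy(counts, n):
    """Plug-in (maximum-likelihood) entropy in nats from bin counts."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def miller_madow_entropy(counts, n):
    """Plug-in entropy plus the first-order (K - 1) / (2n) correction.

    K is the number of occupied bins. This is the standard Miller-Madow
    term, used here as a stand-in; it is NOT the paper's expression.
    """
    k = sum(1 for c in counts if c > 0)
    return plugin_entropy(counts, n) + (k - 1) / (2 * n)


def entropy_estimates(pairs):
    """Estimate H(X,Y) and I(X;Y) from a list of discrete (x, y) pairs."""
    n = len(pairs)
    joint = list(Counter(pairs).values())
    x = list(Counter(a for a, _ in pairs).values())
    y = list(Counter(b for _, b in pairs).values())

    result = {}
    for name, h in (("plugin", plugin_entropy),
                    ("miller_madow", miller_madow_entropy)):
        h_xy = h(joint, n)
        # I(X;Y) = H(X) + H(Y) - H(X,Y)
        result[name] = {
            "joint_entropy": h_xy,
            "mutual_information": h(x, n) + h(y, n) - h_xy,
        }
    return result


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
    print(entropy_estimates(sample))
```

The plug-in entropy underestimates the true entropy on average, so the plug-in mutual information tends to be biased upward. The correction shrinks as 1/n, which is why the dependence on sample size matters when only one sample is available.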
Keywords :
covariance analysis; data mining; data structures; entropy; asymptotic series; finite sample sizes; joint entropy; mutual information estimates; Frequency estimation; Image coding; Mutual information; Nuclear physics; Polarization; Probability; Psychology; Size control
Conference_Title :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1381108