Title :
An Information Measure for Comparing Top k Lists
Author :
Collier, James H. ; Konagurthu, Arun S.
Author_Institution :
Clayton Sch. of Inf. Technol., Monash Univ., Clayton, VIC, Australia
Abstract :
Comparing the top k elements of two or more ranked results is a common task in many settings. A few measures with attractive mathematical properties have been proposed to compare top k lists, but they face a number of pitfalls and shortcomings in practice. This work introduces a new measure to compare any two top k lists based on the information these lists convey. Our method investigates the compressibility of the lists: the length of the message needed to losslessly encode the lists gives a natural and robust measure of their variability. This information-theoretic measure objectively reconciles all the main considerations that arise when measuring (dis-)similarity between lists: the extent of their non-overlapping elements, the amount of disarray among overlapping elements, and the displacement of the actual ranks (positions) of their overlapping elements. We demonstrate that our measure is intuitively simple and superior to other commonly used measures. To the best of our knowledge, this is the first attempt to address the problem using information compression as its basis.
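To make the three considerations named in the abstract concrete, the following is a minimal Python sketch that computes each of them separately for two top k lists. It is not the authors' information measure (the abstract does not specify the encoding); the function name `top_k_components` and the choice of inversion count for disarray and summed absolute rank difference for displacement are assumptions made only for illustration.

```python
from itertools import combinations


def top_k_components(a, b):
    """Return (non_overlap, disarray, displacement) for two ranked lists.

    Hypothetical helper for illustration only; assumes each list has no
    duplicate items. Ranks are 1-based positions within each list.
    """
    rank_a = {item: i for i, item in enumerate(a, start=1)}
    rank_b = {item: i for i, item in enumerate(b, start=1)}

    # Items present in both lists, kept in the order they appear in `a`.
    shared = [item for item in a if item in rank_b]

    # Extent of non-overlap: items that appear in exactly one list.
    non_overlap = (len(a) - len(shared)) + (len(b) - len(shared))

    # Disarray: pairs of shared items whose relative order differs
    # between the two lists (an inversion count).
    disarray = sum(
        1 for x, y in combinations(shared, 2) if rank_b[x] > rank_b[y]
    )

    # Displacement: total absolute difference in rank over shared items.
    displacement = sum(abs(rank_a[x] - rank_b[x]) for x in shared)

    return non_overlap, disarray, displacement


if __name__ == "__main__":
    first = ["A", "B", "C", "D", "E"]
    second = ["B", "A", "F", "C", "G"]
    print(top_k_components(first, second))
```

The sketch reports the three quantities as separate numbers, which illustrates the difficulty the abstract points to: they are on different scales, and some rule is needed to reconcile them into one value. According to the abstract, the paper does this through the length of a lossless encoding of the lists.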
Keywords :
information theory; compressibility; information compression; information measure; information-theoretic measure; mathematical property; nonoverlapping element; top k elements; top k lists; Displacement measurement; Encoding; Joints; Loss measurement; Position measurement; Radiation detectors
Conference_Title :
e-Science (e-Science), 2014 IEEE 10th International Conference on
Conference_Location :
Sao Paulo
Print_ISBN :
978-1-4799-4288-6
DOI :
10.1109/eScience.2014.39