Title :
Validity of Probabilistic Rules
Author :
Sapir, Marina ; Teverovskiy, Mikhail
Author_Institution :
Aureon Labs., Yonkers, NY
Date :
March 1 2007-April 5 2007
Abstract :
We propose an axiomatic approach to defining the validity of probabilistic inductive rules E ⇒ H. A set of rules is evaluated against an available dataset in which each of the conditions E and H is either true or false for every instance. We introduce six axioms that formalize common-sense dependencies between the validity of a rule and its support, confidence, lift, and amount of available evidence. Having a single validity measure, rather than multiple criteria, helps compare and rank induced rules. We demonstrate that the z-test of the difference of proportions satisfies all the axioms and can be used as a measure of rule validity. Knowing that the z-test statistic is normally distributed allows one to filter out statistically unreliable rules. We demonstrate the advantages of the proposed approach on a real-life medical dataset.
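As an illustration of the kind of measure the abstract describes, the following is a minimal sketch of a pooled two-proportion z-test applied to a rule E ⇒ H. The abstract does not state which proportions are compared, so comparing P(H | E) with P(H | not E) is an assumption here, and the function names and count arguments (`n_eh`, `n_e`, `n_h`, `n`) are hypothetical, not taken from the paper.

```python
import math


def rule_z_score(n_eh, n_e, n_h, n):
    """Pooled two-proportion z statistic for a rule E => H.

    Assumed formulation (not confirmed by the abstract): compares
    P(H | E) against P(H | not E).

    n_eh: instances where both E and H hold
    n_e:  instances where E holds
    n_h:  instances where H holds
    n:    total number of instances
    """
    n_not_e = n - n_e
    if n_e == 0 or n_not_e == 0:
        raise ValueError("E must be true for some instances and false for others")

    p_h_given_e = n_eh / n_e                   # confidence of the rule
    p_h_given_not_e = (n_h - n_eh) / n_not_e   # rate of H when E is false
    p_pooled = n_h / n                         # overall rate of H

    std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_e + 1 / n_not_e))
    if std_err == 0:
        return 0.0  # H is always true or always false: E carries no information
    return (p_h_given_e - p_h_given_not_e) / std_err


def one_sided_p_value(z):
    """Upper-tail probability of a standard normal variable exceeding z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Toy counts: 100 instances, E true in 50, H true in 60, both true in 40.
z = rule_z_score(n_eh=40, n_e=50, n_h=60, n=100)
p = one_sided_p_value(z)
```

Under this formulation a rule could be ranked by `z` and discarded when `p` exceeds a chosen significance level; the threshold itself is a choice the abstract does not specify.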
Keywords :
data mining; probability; statistical testing; common sense dependencies; medical dataset; probabilistic inductive rules; validity measure; z-test statistics; Association rules; Computational intelligence; Data analysis; Data mining; Filters; Production; Statistical distributions; USA Councils
Conference_Title :
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0705-2
DOI :
10.1109/CIDM.2007.368845