Title :
Sampling issues in generating rules from databases
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Connecticut, Storrs, CT, USA
Abstract :
There have been a number of studies concerning inductive rule generation from databases. All inductive rules are based on instances in the database, and such instances can be regarded as a sample of a real-world population. Therefore, the importance of the validity, unbiasedness and correctness of these instances cannot be overemphasized in a rule induction environment. While many researchers have focused on methods of generating rules from databases, the author discusses some sampling issues that arise in rule generation from databases and tries to bridge the gap between sampling in statistics and rule generation in databases. Two sampling problems that commonly occur in rule induction, small sample size and biased samples, are studied. The author investigates how these problems affect the validity of rule induction and provides a set of criteria for a rule induction system to generate feasible rules.
Keywords :
database theory; deductive databases; inference mechanisms; biased sample; correctness; database instances; feasible rules; rule generation; rule induction system; small sample size; statistics; unbiasedness; validity; Artificial intelligence; Bridges; Computer science; Data engineering; Databases; Diseases; Induction generators; Medical diagnostic imaging; Sampling methods; Statistics
Conference_Title :
Tools with Artificial Intelligence, 1993. TAI '93. Proceedings., Fifth International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
0-8186-4200-9
DOI :
10.1109/TAI.1993.633992