DocumentCode :
3222675
Title :
Rule mining and classification in imperfect databases
Author :
Hewawasam, K.K.R.G.K. ; Premaratne, K. ; Subasingha, S.P. ; Shyu, M.-L.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Miami, Coral Gables, FL, USA
Volume :
1
fYear :
2005
fDate :
25-28 July 2005
Abstract :
A rule-based classifier learns rules from a set of training data instances with assigned class labels and then uses those rules to assign a class label to a new incoming data instance. To accommodate data imperfections, a probabilistic relational model represents the attributes by probabilistic functions. One extension to this model uses belief functions instead; such an approach can represent a wider range of data imperfections. However, extracting frequent patterns and rules from such a "belief theoretic" relational database carries a potentially enormous computational burden. In this work, we present a data structure that is an alternate representation of a belief theoretic relational database. We then develop efficient algorithms to query for the belief of item sets, extract frequent item sets, and generate the corresponding association rules from this representation. This set of rules is then used as the basis on which an unknown data instance, whose attributes are represented via belief functions, is classified. These algorithms are tested on a data set collected from a test bed that mimics an airport threat detection and classification scenario in which both data attributes and threat class labels may possess imperfections.
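As a rough illustration of what it means to represent an attribute by a belief function and to query the belief of a set of values, the sketch below applies the standard Dempster-Shafer definitions of belief and plausibility to a single attribute. It is not the data structure or the algorithms of the paper, which the abstract does not detail; the attribute values, the mass assignment, and the function names `belief` and `plausibility` are all hypothetical.

```python
from fractions import Fraction


def belief(mass, query):
    """Bel(A): total mass of the non-empty focal sets contained in A."""
    query = frozenset(query)
    return sum(
        (m for focal, m in mass.items() if focal and focal <= query),
        Fraction(0),
    )


def plausibility(mass, query):
    """Pl(A): total mass of the focal sets that intersect A."""
    query = frozenset(query)
    return sum(
        (m for focal, m in mass.items() if focal & query),
        Fraction(0),
    )


# Hypothetical mass assignment for one imperfect attribute value.
# Frame of discernment (assumed): {"low", "medium", "high"}.
mass = {
    frozenset({"high"}): Fraction(1, 2),                    # precise evidence
    frozenset({"medium", "high"}): Fraction(3, 10),         # ambiguous evidence
    frozenset({"low", "medium", "high"}): Fraction(1, 5),   # complete ignorance
}

print(belief(mass, {"high"}))            # 1/2
print(belief(mass, {"medium", "high"}))  # 4/5
print(plausibility(mass, {"low"}))       # 1/5
```

A probabilistic model would need to split the ambiguous mass among single values; the belief function keeps it on the set `{"medium", "high"}`, which is what lets it express a wider range of imperfections.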
Keywords :
belief networks; data mining; inference mechanisms; relational databases; security of data; tree data structures; uncertainty handling; Dempster-Shafer theory; airport threat detection; association rule generation; belief function; class label assignment; data structure; frequent item set extraction; imperfect database classification; probabilistic relational model; rule mining; rule-based classifier; training data; Airports; Application software; Association rules; Data mining; Data structures; Interpolation; Itemsets; Relational databases; Testing; Training data; Data imperfections; Dempster-Shafer belief theory; association rules; classification; data ambiguities; data mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Fusion, 2005 8th International Conference on
Print_ISBN :
0-7803-9286-8
Type :
conf
DOI :
10.1109/ICIF.2005.1591917
Filename :
1591917