DocumentCode :
539338
Title :
A taxonomy-based classification model by using abstraction and aggregation
Author :
Tang, Alice ; Fong, Simon
Author_Institution :
Fac. of Sci. & Technol., Univ. of Macau, Macau, China
fYear :
2010
fDate :
Nov. 30 2010-Dec. 2 2010
Firstpage :
448
Lastpage :
454
Abstract :
Data preprocessing is an important data manipulation step that precedes mining. Various techniques, including feature selection and data transformation, have been studied with the aim of producing a compact and efficient decision tree. Each has its strengths, but in general they fail to preserve the meanings of the attributes. An Attribute Value Taxonomy (AVT), originally proposed by Honavar in 2003, is a value set of a particular attribute specified at different levels of precision and representable as a tree structure. An AVT makes attributes natural and easy to understand across a hierarchy of resolutions. In this paper, we extend the concept of AVT to data preprocessing, building decision trees from attributes abstracted at different levels. The result is a series of decision trees, each built for one abstraction level of the concept. A visualization tool was also developed that shows both the significance of the attributes and the predictive power of each tree. A live dataset of e-Bay transactions was used as a case study. The experimental results indicate that, with appropriate abstraction and aggregation techniques, the decision tree can be made simpler and its accuracy improved. The resulting trees can be mapped back to the AVT for easy interpretation by humans.
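The abstraction and aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the taxonomy, the category names, and the helper functions `path_to_root`, `abstract`, and `aggregate` are all hypothetical. It only shows how attribute values might be lifted to a chosen level of a taxonomy before a decision tree is trained at that level.

```python
from collections import Counter

# Hypothetical taxonomy for one attribute: child value -> parent value.
# The category names are invented for illustration, not taken from the paper.
PARENT = {
    "laptop": "computers",
    "desktop": "computers",
    "phone": "mobile",
    "tablet": "mobile",
    "computers": "electronics",
    "mobile": "electronics",
}


def path_to_root(value):
    """Return the chain [value, parent, ..., root] in the taxonomy."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def abstract(value, level):
    """Map a value to its ancestor at the given depth (0 = root).

    Values already at or above that depth are returned unchanged.
    """
    path = path_to_root(value)[::-1]  # root first
    return path[min(level, len(path) - 1)]


def aggregate(values, level):
    """Count records per abstracted value at one taxonomy level."""
    return Counter(abstract(v, level) for v in values)


# One aggregated view of the same records per abstraction level.
records = ["laptop", "phone", "tablet", "desktop", "laptop"]
by_level = {level: aggregate(records, level) for level in range(3)}
```

Under these assumptions, each entry of `by_level` is the same data seen at a coarser or finer resolution. A separate decision tree could then be trained on each abstracted view, which is the "series of decision trees" the abstract refers to.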
Keywords :
abstracting; classification; data mining; decision trees; tree data structures; Honavar; abstraction; aggregation; attribute value taxonomies; data manipulation process; data mining; data preprocessing; data transformation; decision tree; e-Bay transactions; feature selection; taxonomy-based classification model; tree-structure; visualization tool; Accuracy; Classification algorithms; Data mining; Data models; Decision trees; Marketing and sales; Taxonomy; Attribute Value Taxonomies; Data Mining; Decision Tree; Feature selection; Preprocessing; Visual Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Information Management and Service (IMS), 2010 6th International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-8599-4
Electronic_ISBN :
978-89-88678-32-9
Type :
conf
Filename :
5713492