• DocumentCode
    539338
  • Title

    A taxonomy-based classification model by using abstraction and aggregation

  • Author

    Tang, Alice ; Fong, Simon

  • Author_Institution
    Fac. of Sci. & Technol., Univ. of Macau, Macau, China
  • fYear
    2010
  • fDate
    Nov. 30 2010-Dec. 2 2010
  • Firstpage
    448
  • Lastpage
    454
  • Abstract
    Data preprocessing is an important data manipulation step prior to mining. Various techniques, including feature selection and data transformation, have been studied with the aim of producing a compact and efficient decision tree. Each has its strengths, but in general they fail to preserve the meanings of the attributes. An Attribute Value Taxonomy (AVT) is the value set of a particular attribute specified at different levels of precision, and it can be represented as a tree structure; the concept was originally proposed by Honavar in 2003. AVT has the advantage of allowing attributes to be understood naturally and easily in a hierarchy of resolutions. In this paper, we extend the concept of AVT into the domain of data preprocessing for building decision trees based on attributes abstracted at different levels. The result is a series of decision trees, each built for a specific level of abstraction. A visualization tool was also developed that shows both the significance of the attributes and the predictive power of each tree. A live dataset of e-Bay transactions was used as a case study. The experimental results indicate that, by applying appropriate abstraction and aggregation techniques, the decision tree can be made simpler and its accuracy can be improved. The resultant trees can be mapped to the AVT for easy interpretation by humans.
  • Keywords
    abstracting; classification; data mining; decision trees; tree data structures; Honavar; abstraction; aggregation; attribute value taxonomies; data manipulation process; data mining; data preprocessing; data transformation; decision tree; e-Bay transactions; feature selection; taxonomy-based classification model; tree-structure; visualization tool; Accuracy; Classification algorithms; Data mining; Data models; Decision trees; Marketing and sales; Taxonomy; Attribute Value Taxonomies; Data Mining; Decision Tree; Feature selection; Preprocessing; Visual Mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Management and Service (IMS), 2010 6th International Conference on
  • Conference_Location
    Seoul
  • Print_ISBN
    978-1-4244-8599-4
  • Electronic_ISBN
    978-89-88678-32-9
  • Type

    conf

  • Filename
    5713492