• DocumentCode
    1279180
  • Title

    An efficient inductive learning method for object-oriented database using attribute entropy

  • Author

    Huang, Yueh-Min ; Lin, Shian-Hua

  • Author_Institution
    Dept. of Eng. Sci., Nat. Cheng Kung Univ., Tainan, Taiwan
  • Volume
    8
  • Issue
    6
  • fYear
    1996
  • fDate
    12/1/1996
  • Firstpage
    946
  • Lastpage
    951
  • Abstract
    The data-driven characteristic of the Version Space rule-learning method works efficiently in memory even if the training set is enormous. However, the concept hierarchy of each attribute used to generalize/specialize the hypothesis of a specific/general (S/G) set is processed sequentially and instance by instance, which degrades its performance. ID3, by contrast, generates its decision tree by ordering attributes according to their entropies, which reduces the number of attributes on some of the tree paths. Unlike Version Space, ID3 generates an extremely complex decision tree when the training set is enormous. Therefore, we propose a method called AGE (ARCH+OGL+ASE, where ARCH=“Automatic geneRation of Concept Hierarchies”, OGL=“Optimal Generalization Level”, and ASE=“Attribute Selection by Entropy”), which takes advantage of both Version Space and ID3 to learn rules from object-oriented databases (OODBs) with the fewest learning features according to the entropy. In simulations, we found that our learning algorithm performs better than both Version Space and ID3. Furthermore, AGE's time complexity and space complexity are both linear in the number of training instances.
  • Keywords
    computational complexity; database theory; deductive databases; entropy; generalisation (artificial intelligence); learning by example; object-oriented databases; software performance evaluation; tree data structures; AGE method; ARCH; ASE; ID3 algorithm; OGL; Version Space rule-learning method; attribute entropy; attribute order; attribute selection; concept hierarchy; data-driven characteristics; decision tree; hypothesis generalization; hypothesis specialization; inductive learning method; learning features; object-oriented database; optimal generalization level; performance; performance degradation; sequential instance-by-instance processing; space complexity; time complexity; training set size; tree paths; Costs; Decision trees; Degradation; Entropy; Expert systems; Knowledge based systems; Learning systems; Object oriented databases; Object oriented modeling; Spatial databases;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.553161
  • Filename
    553161