• DocumentCode
    2832362
  • Title
    A Hybrid Evolutionary Approach To Construct Optimal Decision Trees With Large Data Sets
  • Author
    Patil, D.V.; Bichkar, R.S.
  • Author_Institution
    S.G.G.S. Inst. of Eng. & Tech., Nanded, Maharashtra
  • fYear
    2006
  • fDate
    15-17 Dec. 2006
  • Firstpage
    429
  • Lastpage
    433
  • Abstract
    Data mining environments produce large volumes of data. The knowledge this data contains can be used to improve an organization's decision-making process. When such large amounts of data are used for decision tree construction, the resulting trees are large and incomprehensible to human experts. Learning on high-volume data also becomes very slow, because it has to be done serially over the available large data sets. Our ultimate goal is to build smaller trees that give equally accurate solutions from randomly sampled data. We experimented with techniques that combine incremental random sampling with genetic algorithms, which use global search to evolve decision trees, in order to obtain a compact representation of a large data set. Experiments on several data sets showed that the proposed random sampling procedures, combined with genetic algorithms for building decision trees, give relatively smaller trees than other methods while remaining equally accurate. The method combines optimization with comprehensibility and scalability. We explored the method as a way to avoid problems such as slow execution and overloading of memory and processor when working with very large databases.
  • Keywords
    data mining; decision making; decision trees; genetic algorithms; global search techniques; hybrid evolutionary approach; incremental random sampling; large data sets; optimal decision trees; optimization; Biological cells; Classification tree analysis; Humans; Optimization methods; Sampling methods; Testing; Comprehensibility; classification accuracy; decision tree; genetic algorithm; genetically evolved decision Tree; training set size
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Technology, 2006. ICIT 2006. IEEE International Conference on
  • Conference_Location
    Mumbai
  • Print_ISBN
    1-4244-0726-5
  • Electronic_ISBN
    1-4244-0726-5
  • Type
    conf
  • DOI
    10.1109/ICIT.2006.372250
  • Filename
    4237572
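
The abstract describes incremental random sampling combined with a genetic algorithm, but the record gives no algorithmic detail. The following is only a minimal sketch of that general idea, not the authors' method. It assumes numeric features scaled to [0, 1] and binary labels, and it evolves a one-split tree (a "stump") in place of a full decision tree to stay short. All names (`stump_accuracy`, `evolve_stump`, `incremental_sampling`) and all parameter values are hypothetical.

```python
import random

def stump_accuracy(stump, rows):
    """Accuracy of a one-split tree: predict class 1 when x[feature] > threshold."""
    feature, threshold = stump
    hits = sum((x[feature] > threshold) == bool(y) for x, y in rows)
    return hits / len(rows)

def evolve_stump(rows, n_features, rng, pop_size=20, generations=15):
    """Toy genetic search over (feature, threshold) pairs (assumed encoding)."""
    pop = [(rng.randrange(n_features), rng.uniform(0.0, 1.0)) for _ in range(pop_size)]

    def fitness(ind):
        return stump_accuracy(ind, rows)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            feature = a[0] if rng.random() < 0.5 else b[0]        # crossover on feature
            threshold = (a[1] + b[1]) / 2 + rng.gauss(0.0, 0.05)  # blend plus mutation
            children.append((feature, min(1.0, max(0.0, threshold))))
        pop = parents + children
    return max(pop, key=fitness)

def incremental_sampling(data, n_features, start=50, step=50,
                         holdout_size=200, tol=0.01, seed=0):
    """Grow a random sample until holdout accuracy stops improving by `tol`.

    Returns (stump, sample_size_used, holdout_accuracy). If the pool is smaller
    than `start`, nothing is trained and (None, 0, -1.0) is returned.
    """
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    holdout, pool = shuffled[:holdout_size], shuffled[holdout_size:]
    best, best_acc, used = None, -1.0, 0
    size = start
    while size <= len(pool):
        # A prefix of the shuffled pool is a random sample that grows incrementally.
        candidate = evolve_stump(pool[:size], n_features, rng)
        acc = stump_accuracy(candidate, holdout)
        if best is not None and acc - best_acc < tol:
            break                                  # larger sample no longer helps
        best, best_acc, used = candidate, acc, size
        size += step
    return best, used, best_acc
```

The point of the sketch is the stopping rule: training runs on a small random sample first, and the full data set is touched only if larger samples keep improving holdout accuracy, which is one way to limit the memory and run-time cost the abstract mentions.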