• DocumentCode
    2778992
  • Title
    AutoClustering: An estimation of distribution algorithm for the automatic generation of clustering algorithms
  • Author
    Meiguins, Aruanda S G ; Limão, Roberto C. ; Meiguins, Bianchi S. ; Junior, Samuel F S ; Freitas, Alex A.
  • Author_Institution
    PPGEE, UFPA, Belém, Brazil
  • fYear
    2012
  • fDate
    10-15 June 2012
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Most existing data mining algorithms have been produced manually, that is, developed by a human programmer. A prominent Artificial Intelligence research area is automatic programming: the generation of a computer program by another computer program. Clustering is an important data mining task with many useful real-world applications. In particular, the class of density-based clustering algorithms, which use data density to identify clusters, has many advantages, such as the ability to identify arbitrary-shape clusters. We propose the use of Estimation of Distribution Algorithms (EDAs) for the automatic generation of density-based clustering algorithms. To guarantee the generation of valid algorithms, a directed acyclic graph (DAG) was defined in which each node represents a procedure (building block) and each edge represents a possible execution sequence between two nodes. The Building Blocks DAG specifies the alphabet of the EDA, that is, every algorithm that can possibly be generated. Preliminary experimental results compare the clustering algorithms automatically generated by AutoClustering with DBSCAN, a well-known manually designed algorithm.
  • Keywords
    data mining; directed graphs; pattern clustering; AutoClustering; DAG; DBSCAN; EDA; arbitrary-shape clusters; artificial intelligence research area; automatic clustering algorithm generation; automatic programming; computer program; data mining algorithms; density-based clustering algorithms; directed acyclic graph; distribution algorithm; human programmer; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Estimation; Manuals; Training; Automatic Programming; Data Mining; Density-Based Clustering; Estimation of Distribution Algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation (CEC), 2012 IEEE Congress on
  • Conference_Location
    Brisbane, QLD
  • Print_ISBN
    978-1-4673-1510-4
  • Electronic_ISBN
    978-1-4673-1508-1
  • Type
    conf
  • DOI
    10.1109/CEC.2012.6252874
  • Filename
    6252874
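
The abstract describes a DAG whose nodes are building blocks and whose edges are allowed execution sequences, with an EDA searching over the algorithms that DAG can express. The following is only a minimal sketch of that idea, not the authors' implementation: the block names, the graph itself, the univariate per-edge probability model, the smoothing constant, and the toy fitness function are all assumptions made for illustration, since the abstract does not specify them.

```python
import random

# Hypothetical building-block DAG (assumption: the real blocks and edges
# are not listed in the abstract). Each key is a procedure; each value
# lists the procedures that may legally follow it.
DAG = {
    "start": ["estimate_density", "find_neighbors"],
    "estimate_density": ["find_neighbors", "expand_cluster"],
    "find_neighbors": ["expand_cluster"],
    "expand_cluster": ["merge_clusters", "end"],
    "merge_clusters": ["end"],
    "end": [],
}


def uniform_model(dag):
    """Initial probability model: every outgoing edge is equally likely."""
    return {
        node: {s: 1.0 / len(succ) for s in succ}
        for node, succ in dag.items()
        if succ
    }


def sample_algorithm(model, rng):
    """Walk the DAG from 'start' to 'end'; any such walk is a valid algorithm."""
    path = ["start"]
    while path[-1] != "end":
        successors = model[path[-1]]
        nxt = rng.choices(list(successors), weights=list(successors.values()))[0]
        path.append(nxt)
    return path


def update_model(dag, selected, smoothing=1.0):
    """Re-estimate edge probabilities from the selected (best) individuals."""
    counts = {
        node: {s: smoothing for s in succ} for node, succ in dag.items() if succ
    }
    for path in selected:
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1
    return {
        node: {s: c / sum(cs.values()) for s, c in cs.items()}
        for node, cs in counts.items()
    }


def run_eda(fitness, generations=10, pop_size=30, keep=10, seed=0):
    """Sample, select the fittest, re-estimate the model, and repeat."""
    rng = random.Random(seed)
    model = uniform_model(DAG)
    best = None
    for _ in range(generations):
        population = [sample_algorithm(model, rng) for _ in range(pop_size)]
        population.sort(key=fitness, reverse=True)
        if best is None or fitness(population[0]) > fitness(best):
            best = population[0]
        model = update_model(DAG, population[:keep])
    return best, model


if __name__ == "__main__":
    # Toy fitness (assumption): prefer longer pipelines. A real fitness would
    # run the generated clustering algorithm and score its clusters.
    best, model = run_eda(fitness=len)
    print(" -> ".join(best))
```

Because individuals are sampled only along existing edges, every candidate is a valid path through the DAG by construction, which mirrors the validity guarantee the abstract attributes to the Building Blocks DAG.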