• DocumentCode
    2222306
  • Title

    An effective support vector machines (SVMs) performance using hierarchical clustering

  • Author

    Awad, Mamoun ; Khan, Latifur ; Bastani, Farokh ; Yen, I-Ling

  • Author_Institution
    Dept. of Comput. Sci., Texas Univ., Dallas, TX, USA
  • fYear
    2004
  • fDate
    15-17 Nov. 2004
  • Firstpage
    663
  • Lastpage
    667
  • Abstract
    The training time for SVMs to compute the maximal marginal hyper-plane is at least O(N^2) with the data set size N, which makes it nonfavorable for large data sets. This work presents a study for enhancing the training time of SVMs, specifically when dealing with large data sets, using hierarchical clustering analysis. We use the dynamically growing self-organizing tree (DGSOT) algorithm for clustering because it has proved to overcome the drawbacks of traditional hierarchical clustering algorithms. Clustering analysis helps find the boundary points, which are the most qualified data points to train SVMs, between two classes. We present a new approach of combination of SVMs and DGSOT, which starts with an initial training set and expands it gradually using the clustering structure produced by the DGSOT algorithm. We compare our approach with the Rocchio Bundling technique in terms of accuracy loss and training time gain using two benchmark real data sets.
  • Keywords
    computational complexity; data mining; learning (artificial intelligence); pattern clustering; self-organising feature maps; support vector machines; SVM; data sets; dynamically growing self-organizing tree algorithm; hierarchical clustering; maximal marginal hyper-plane; neural net training time; support vector machines; Bagging; Buildings; Clustering algorithms; Computer science; Data mining; Kernel; Partitioning algorithms; Support vector machine classification; Support vector machines; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
  • ISSN
    1082-3409
  • Print_ISBN
    0-7695-2236-X
  • Type

    conf

  • DOI
    10.1109/ICTAI.2004.26
  • Filename
    1374251