• DocumentCode
    2891447
  • Title
    Randomized Sampling for Large Data Applications of SVM
  • Author
    Ferragut, Erik M. ; Laska, J.
  • Author_Institution
    Comput. Sci. & Eng. Div., Oak Ridge Nat. Lab., Oak Ridge, TN, USA
  • Volume
    1
  • fYear
    2012
  • fDate
    12-15 Dec. 2012
  • Firstpage
    350
  • Lastpage
    355
  • Abstract
    A trend in machine learning is the application of existing algorithms to ever-larger datasets. Support Vector Machines (SVM) have been shown to be very effective, but have been difficult to scale to large-data problems. Some approaches have sought to scale SVM training by approximating and parallelizing the underlying quadratic optimization problem. This paper pursues a different approach. Our algorithm, which we call Sampled SVM, uses an existing SVM training algorithm to create a new SVM training algorithm. It uses randomized data sampling to better extend SVMs to large data applications. Experiments on several datasets show that our method is faster than and comparably accurate to both the original SVM algorithm it is based on and the Cascade SVM, the leading data organization approach for SVMs in the literature. Further, we show that our approach is more amenable to parallelization than Cascade SVM.
  • Keywords
    data handling; learning (artificial intelligence); random processes; sampling methods; support vector machines; very large databases; SVM training algorithm; cascade SVM; data organization approach; large data application; machine learning; randomized data sampling; sampled SVM; support vector machine; Approximation algorithms; Data processing; Kernel; Machine learning algorithms; Support vector machines; Training; Vectors; machine learning; parallelization; random sampling; randomized algorithms; scalability; support vector machine; svm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2012 11th International Conference on
  • Conference_Location
    Boca Raton, FL
  • Print_ISBN
    978-1-4673-4651-1
  • Type
    conf
  • DOI
    10.1109/ICMLA.2012.65
  • Filename
    6406687