Title :
Posterior probability support vector machines for unbalanced data
Author :
Tao, Qing ; Wu, Gao-Wei ; Wang, Fei-Yue ; Wang, Jue
Author_Institution :
Key Lab. of Complex Syst. & Intelligence Sci., Chinese Acad. of Sci., Beijing, China
Abstract :
This paper proposes a complete framework of posterior probability support vector machines (PPSVMs) for weighted training samples using modified concepts of risk, linear separability, margin, and optimal hyperplane. Within this framework, a new optimization problem for unbalanced classification problems is formulated and a new concept of support vectors is established. Furthermore, a soft PPSVM with an interpretable parameter ν is obtained, which is similar to the ν-SVM developed by Schölkopf et al., and an empirical method for determining the posterior probability is proposed as a new approach to determining ν. The main advantage of a PPSVM classifier lies in the fact that it is closer to the Bayes optimal classifier without requiring knowledge of the underlying distributions. To validate the proposed method, two synthetic classification examples are used to illustrate the logical correctness of PPSVMs and their relationship to regular SVMs and Bayesian methods. Several other classification experiments are conducted to demonstrate that PPSVMs perform better than regular SVMs in some cases. Compared with fuzzy support vector machines (FSVMs), the proposed PPSVM is a natural and analytical extension of regular SVMs based on statistical learning theory.
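The abstract mentions an empirical method for determining the posterior probability but does not spell it out. As a rough illustration only, the sketch below estimates P(class = +1 | x) for each training sample as the fraction of positive labels among its k nearest neighbours. The k-nearest-neighbour estimator, the function name knn_posterior, the parameter k, and the toy data are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' procedure.

```python
import math


def knn_posterior(samples, labels, query_index, k=3):
    """Estimate P(label = +1 | x) for samples[query_index].

    Hypothetical helper: the estimate is the fraction of +1 labels among
    the k nearest samples (Euclidean distance, query point included).
    """
    query = samples[query_index]
    # Rank every sample index by its distance to the query point.
    order = sorted(range(len(samples)),
                   key=lambda i: math.dist(query, samples[i]))
    nearest = order[:k]
    positives = sum(1 for i in nearest if labels[i] == 1)
    return positives / k


# Toy 1-D data (assumed): a cluster of positives near 0 containing one
# sample labelled -1, and a clean cluster of negatives near 3.
samples = [(0.0,), (0.2,), (0.4,), (0.1,), (3.0,), (3.2,), (3.4,)]
labels = [1, 1, 1, -1, -1, -1, -1]

posteriors = [knn_posterior(samples, labels, i, k=3)
              for i in range(len(samples))]
```

In this toy set, the sample at 0.1 is labelled -1 but sits among positives, so its estimated posterior for the positive class is high. A soft, per-sample estimate of this kind is one plausible way to weight training samples instead of trusting each hard label equally.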
Keywords :
inference mechanisms; learning (artificial intelligence); optimisation; pattern classification; support vector machines; Bayes optimal; Bayesian decision theory; Bayesian methods; PPSVM classifier; fuzzy support vector machines; interpretable parameter; linear separability; logical correctness; maximal margin algorithms; optimal hyperplane; optimization problem; posterior probability support vector machines; statistical learning theory; unbalanced classification; unbalanced data; weighted training samples; Automation; Decision theory; Intelligent systems; Laboratories; Machine learning algorithms; Probability; Statistical learning; Support vector machine classification; classification; margin; posterior probability; support vector machines (SVMs); Algorithms; Artificial Intelligence; Computer Simulation; Databases, Factual; Information Storage and Retrieval; Models, Statistical; Pattern Recognition, Automated
Journal_Title :
Neural Networks, IEEE Transactions on
DOI :
10.1109/TNN.2005.857955