Title :
Density-Induced Support Vector Data Description
Author :
Lee, KiYoung ; Kim, Dae-Won ; Lee, Kwang H. ; Lee, Doheon
Author_Institution :
Dept. of BioSystems, Korea Adv. Inst. of Sci. & Technol. (KAIST), Daejeon
Abstract :
The purpose of data description is to give a compact description of the target data that represents most of its characteristics. In support vector data description (SVDD), this compact description takes the form of a hyperspherical model determined by a small portion of the data, called support vectors. Despite its usefulness, conventional SVDD may fail to identify the optimal target description, especially when the support vectors do not reflect the overall characteristics of the target data. To address this issue, we propose a new SVDD that introduces distance measurements based on a relative density degree for each data point, so that the model reflects the distribution of the given data set. Moreover, as a real application, we extend the proposed method to the protein localization prediction problem, which is both a multiclass and a multilabel problem. Experiments with various real data sets show promising results.
Keywords :
computational complexity; data description; data handling; distance measurement; proteins; quadratic programming; support vector machines; compact description; density-induced support vector data description; distance measurements; hyperspherical model; protein localization prediction problem; Clustering methods; Computer science; Distance measurement; Laboratories; Object detection; Proteins; Support vector machine classification; Support vector machines; Systems biology; Training data; Data domain description; density-induced support vector data description (D-SVDD); one-class classification; outlier detection; support vector data description (SVDD); Algorithms; Artificial Intelligence; Computer Simulation; Data Interpretation, Statistical; Databases, Factual; Information Storage and Retrieval; Models, Statistical; Pattern Recognition, Automated; Statistical Distributions;
Journal_Title :
Neural Networks, IEEE Transactions on
DOI :
10.1109/TNN.2006.884673