Title :
RankRC: Large-Scale Nonlinear Rare Class Ranking
Author :
Tayal, Aditya ; Coleman, Thomas F. ; Li, Yuying
Author_Institution :
Cheriton Sch. of Comput. Sci., Univ. of Waterloo, Waterloo, ON, Canada
Abstract :
Rare class problems are common in real-world applications across a wide range of domains. Standard classification algorithms are known to perform poorly in these cases, since they focus on overall classification accuracy. In addition, the amount of available data has increased significantly in recent years, resulting in many large-scale rare class problems. In this paper, we focus on nonlinear kernel-based classification methods expressed as a regularized loss minimization problem. We address the challenges associated with both rare class problems and large-scale learning by 1) optimizing the area under the receiver operating characteristic curve (AUC) in the training process, instead of classification accuracy, and 2) using a rare class kernel representation to obtain an algorithm that is efficient in both time and space. We call the algorithm RankRC. We provide justifications for the rare class representation and experimentally illustrate the effectiveness of RankRC in terms of test performance, computational complexity, and model robustness.
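The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the rare class kernel representation means restricting the kernel expansion to rare-class training points, and it swaps in a pairwise squared hinge loss, an RBF kernel, and plain gradient descent as stand-ins for whatever loss and solver the paper actually uses. All function and parameter names (`rbf_kernel`, `auc`, `fit_rare_class_ranker`, `gamma`, `lam`) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A (n x d) and rows of B (m x d)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def auc(scores, y):
    """Fraction of (rare, majority) pairs ranked correctly; ties count half."""
    pos, neg = scores[y == 1], scores[y == -1]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


def fit_rare_class_ranker(X, y, gamma=1.0, lam=1e-3, n_iter=500):
    """Assumed model: f(x) = sum_i beta_i * k(x, x_i), i over rare points only.

    Minimizes (lam / 2) * beta' K_pp beta plus the mean squared hinge loss
    over all (rare, majority) pairs, a smooth surrogate for 1 - AUC.
    Labels are +1 for the rare class and -1 for the majority class.
    """
    X_pos = X[y == 1]
    K = rbf_kernel(X, X_pos, gamma)          # n x m instead of n x n
    K_pos, K_neg = K[y == 1], K[y == -1]     # K_pos is the m x m block K_pp
    m = X_pos.shape[0]
    n_pairs = K_pos.shape[0] * K_neg.shape[0]
    # Conservative step size from a crude curvature bound (entries lie in [0, 1]).
    lr = 1.0 / (m * (2.0 + lam))
    beta = np.zeros(m)
    for _ in range(n_iter):
        f_pos, f_neg = K_pos @ beta, K_neg @ beta
        # Hinge margins for every (rare i, majority j) pair; naive O(m * n_neg).
        margin = np.maximum(0.0, 1.0 - (f_pos[:, None] - f_neg[None, :]))
        grad_loss = -2.0 * (margin.sum(1) @ K_pos - margin.sum(0) @ K_neg) / n_pairs
        beta -= lr * (lam * (K_pos @ beta) + grad_loss)
    return beta, X_pos


def decision_function(X_new, X_pos, beta, gamma=1.0):
    """Ranking scores for new points; higher means more likely rare."""
    return rbf_kernel(X_new, X_pos, gamma) @ beta
```

Under these assumptions the kernel matrix has one column per rare-class point, so storage grows with the number of rare examples rather than with the full training set size. The explicit pairwise margin matrix is the naive part of the sketch and would need a more careful loss evaluation at large scale.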
Keywords :
computational complexity; learning (artificial intelligence); minimisation; pattern classification; RankRC algorithm; large scale learning; large-scale nonlinear rare class ranking; nonlinear kernel based classification methods; rare class kernel representation; receiver operating characteristic; regularized loss minimization problem; space algorithm; standard classification algorithms; time algorithm; training process; Approximation methods; Computational modeling; Error analysis; Kernel; Machine learning; Support vector machines; imbalanced classification; kernel-based learning; large-scale algorithms; ranking loss;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2015.2453171