Title :
Application of data mining to candidate screening
Author :
Hudli, Shrihari A. ; Hudli, Anand V. ; Hudli, Aditi A.
Author_Institution :
Comput. Sci. Dept., MS Ramaiah Inst. of Technol., Bangalore, India
Abstract :
Classification models are supervised learning methods used for predicting the value of a categorical target attribute. These models use a set of examples, called the training set, to learn to predict the target class of a future example whose class is unknown. The development of learning algorithms capable of learning from past experience is an important step in emulating inductive learning in humans. Classification finds application in many domains, of which selection of customers for a marketing campaign, fraud detection, diagnosis of diseases, image recognition, and spam e-mail filtering are just a few examples. In this paper, we present a comparative study of the application of the Naive Bayes and k-Nearest Neighbors (kNN) classification methods to the problem of screening candidates for a vacant position in an organization. The observable attributes of a candidate profile are first established. In the training phase, a training set of example profiles is used for learning the classification rules of the organization. In the test phase, the accuracy of the classification model is assessed by classifying example profiles not included in the training set, but for which the target class is already known. In the prediction phase, a profile whose target class is not known is classified. We present the results of the comparison of the Naive Bayes method with the kNN method.
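The three phases described in the abstract (training, test, prediction) can be illustrated with a minimal sketch. It is not the authors' implementation: the profile attributes (degree, experience, skill match), the class labels, the example data, the Laplace smoothing, the Hamming distance and k=3 are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical profiles: (degree, experience, skill_match) -> target class.
# Attributes, values and labels are illustrative assumptions, not the paper's data.
TRAIN = [
    (("masters", "senior", "high"), "shortlist"),
    (("masters", "junior", "high"), "shortlist"),
    (("bachelors", "senior", "high"), "shortlist"),
    (("bachelors", "senior", "medium"), "shortlist"),
    (("bachelors", "junior", "low"), "reject"),
    (("bachelors", "junior", "medium"), "reject"),
    (("diploma", "junior", "low"), "reject"),
    (("diploma", "senior", "low"), "reject"),
]
# Held-out profiles whose target class is already known (test phase).
TEST = [
    (("masters", "senior", "medium"), "shortlist"),
    (("diploma", "junior", "medium"), "reject"),
]


def train_naive_bayes(examples):
    """Training phase: count classes and per-class attribute values."""
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for profile, label in examples:
        for i, value in enumerate(profile):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict_naive_bayes(model, profile):
    """Return the class maximising P(class) * product of P(value | class)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total
        for i, value in enumerate(profile):
            # Laplace (add-one) smoothing avoids zero probabilities (an assumption).
            score *= (value_counts[(i, label)][value] + 1) / (n + len(values_seen[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def predict_knn(examples, profile, k=3):
    """Majority vote among the k training profiles with the fewest mismatches."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    nearest = sorted(examples, key=lambda ex: hamming(ex[0], profile))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(predict, test_set):
    """Test phase: fraction of held-out profiles classified correctly."""
    return sum(predict(p) == label for p, label in test_set) / len(test_set)


nb_model = train_naive_bayes(TRAIN)
nb_accuracy = accuracy(lambda p: predict_naive_bayes(nb_model, p), TEST)
knn_accuracy = accuracy(lambda p: predict_knn(TRAIN, p, k=3), TEST)

# Prediction phase: a profile whose target class is not known.
new_profile = ("masters", "junior", "medium")
nb_prediction = predict_naive_bayes(nb_model, new_profile)
knn_prediction = predict_knn(TRAIN, new_profile, k=3)
print(nb_accuracy, knn_accuracy, nb_prediction, knn_prediction)
```

Because every attribute in this toy data is categorical, kNN uses a mismatch count as its distance; numeric attributes would call for a different metric and probably for feature scaling.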
Keywords :
Bayes methods; data mining; learning (artificial intelligence); Naive Bayes classification methods; candidate profile; candidate screening; categorical target attribute; classification rules; diseases diagnosis; fraud detection; image recognition; k-nearest neighbors methods; kNN method; marketing campaign; prediction phase; spam e-mail filtering; supervised learning methods; target class prediction; training set; Accuracy; Classification algorithms; Diseases; Niobium; Candidate Screening; Classification Models; Data Mining; Machine Learning; Naive Bayes Learning; k-Nearest Neighbor Algorithm;
Conference_Title :
Advanced Communication Control and Computing Technologies (ICACCCT), 2012 IEEE International Conference on
Conference_Location :
Ramanathapuram
Print_ISBN :
978-1-4673-2045-0
DOI :
10.1109/ICACCCT.2012.6320788