Title :
Privacy preserving classification by using modified C4.5
Author :
Baghel, Ranjan ; Dutta, Maitreyee
Author_Institution :
Dept. of Comput. Sci. & Eng., Nat. Inst. of Tech. Teachers Training & Res., Chandigarh, India
Abstract :
Datasets supplied to third parties for data mining must be protected so that they cannot be used for secondary purposes. C4.5 is a classification algorithm that works on mixed datasets, and data perturbation is an important technique in data privacy. This paper proposes a modified C4.5 that uses perturbed and unrealized datasets for classification. The decision tree is built using the gain ratio as the split criterion, which is computed from the unreal and perturbed datasets. Experimental results are obtained by simulation in Weka.
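For readers unfamiliar with the split criterion named in the abstract, the following is a minimal sketch of the standard C4.5 gain ratio for a single categorical attribute. It is not the paper's modified computation over unreal and perturbed datasets, which the abstract does not specify. The function names `entropy` and `gain_ratio` and the toy humidity data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    counts = Counter(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one categorical attribute.

    values: attribute value of each record (hypothetical toy input)
    labels: class label of each record
    """
    total = len(labels)

    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)

    # Expected entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information is the entropy of the attribute's own distribution.
    split_info = entropy(values)

    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data, assumed for illustration only (not taken from the paper).
humidity = ["high", "high", "normal", "normal"]
play = ["no", "no", "yes", "yes"]
ratio = gain_ratio(humidity, play)
```

In this toy case the attribute separates the two classes perfectly, so the information gain and the split information are both 1 bit. A tree builder would evaluate this quantity for every candidate attribute and split on the one with the highest ratio.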
Keywords :
data privacy; decision trees; pattern classification; Weka; classification algorithm; data perturbation; decision tree; gain ratio; modified C4.5; perturbed datasets; privacy preserving classification; unreal datasets; Accuracy; Data privacy; Databases; Decision trees; Entropy; Humidity; Training; C4.5; Classification; Data Mining; PPDM; Privacy
Conference_Title :
Contemporary Computing (IC3), 2013 Sixth International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-0190-6
DOI :
10.1109/IC3.2013.6612175