Title :
Privacy Preserving C4.5 Algorithm over Vertically Distributed Datasets
Author :
Shen, Yanguang ; Shao, Hui ; Yang, Li
Author_Institution :
Sch. of Inf. Sci. & Electr. Eng., Hebei Univ. of Eng., Handan
Abstract :
A primary task of privacy-preserving data mining in a distributed environment is to protect privacy while still acquiring accurate data relations. This paper shows how two parties can collaboratively build a decision tree without revealing private data when the datasets are vertically distributed. It presents a PPC4.5 algorithm, a privacy-preserving version of C4.5 for vertically distributed datasets, and an algorithm for computing the best split attribute and the information gain ratio of a node. The secure scalar product protocol and the x ln(x) protocol are used in the collaborative computation to protect privacy effectively.
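As an illustration of the quantities named in the abstract, the following is a minimal Python sketch of the standard C4.5 information gain ratio, together with the observation that a joint count over vertically split data is a scalar product of two 0/1 indicator vectors. It is not the paper's PPC4.5 algorithm: the function names (`entropy`, `gain_ratio`, `joint_count`) and the toy data are hypothetical, and `joint_count` is a plaintext stand-in for the secure scalar product protocol, so it offers no privacy protection by itself.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def gain_ratio(attribute_values, labels):
    """Standard C4.5 gain ratio of one attribute with respect to the class labels."""
    n = len(labels)
    base = entropy(list(Counter(labels).values()))
    # Group the class labels by the value the attribute takes on each record.
    partitions = {}
    for value, label in zip(attribute_values, labels):
        partitions.setdefault(value, []).append(label)
    conditional = sum(
        len(part) / n * entropy(list(Counter(part).values()))
        for part in partitions.values()
    )
    split_info = entropy([len(part) for part in partitions.values()])
    gain = base - conditional
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


def joint_count(indicator_a, indicator_b):
    """Plaintext stand-in (assumption) for a secure scalar product.

    Each party holds a 0/1 vector marking the records that satisfy its own
    local condition; the dot product is the number of records satisfying both.
    """
    return sum(a * b for a, b in zip(indicator_a, indicator_b))


# Hypothetical toy data: one party holds "outlook", the other holds the class.
outlook = ["sunny", "sunny", "rain", "rain"]
play = ["no", "no", "yes", "yes"]

ratio = gain_ratio(outlook, play)  # 1.0: the attribute separates the classes perfectly

sunny = [1 if v == "sunny" else 0 for v in outlook]
is_no = [1 if v == "no" else 0 for v in play]
both = joint_count(sunny, is_no)  # 2 records are both "sunny" and "no"
```

In a privacy-preserving setting, the counts fed into `entropy` would presumably be obtained through the secure scalar product protocol and the logarithmic terms through the x ln(x) protocol, so that neither party sees the other's vector; the sketch above only shows the arithmetic those protocols would protect.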
Keywords :
data mining; data privacy; decision trees; distributed processing; groupware; protocols; PPC4.5 algorithm; best split attribute; collaborative computing; data privacy; data relation; decision tree; distributed environment; information gain ratio; privacy preserving C4.5 algorithm; privacy-preserving data mining; scalar product protocol; vertically distributed datasets; Algorithm design and analysis; Collaboration; Data privacy; Databases; Decision trees; Distributed computing; Mathematical model; Partitioning algorithms; Protection; Protocols; C4.5; decision tree; privacy-preserving;
Conference_Title :
Networks Security, Wireless Communications and Trusted Computing, 2009. NSWCTC '09. International Conference on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-1-4244-4223-2
DOI :
10.1109/NSWCTC.2009.253