Title :
Privacy preserving ID3 using Gini Index over horizontally partitioned data
Author :
Samet, Saeed ; Miri, Ali
Author_Institution :
Univ. of Ottawa, Ottawa
Date :
March 31 2008-April 4 2008
Abstract :
The ID3 algorithm is a standard, popular, and simple method for data classification and decision tree creation. Because privacy must be taken into consideration in data mining, several secure multi-party computation protocols have been presented based on this technique. Entropy and the Gini index are two measures used to compute the information gain at each step when producing a decision tree. The Gini index, however, has been less studied in privacy-preserving data mining protocols. In this paper, we show how the Gini index can be used in privacy-preserving ID3 algorithms to build decision tree classifiers when the database is horizontally partitioned over two or more parties, such that the involved parties can jointly compute the gain value of each normal attribute without revealing their own private information to each other. Three secure multiparty sub-protocols are presented to evaluate the intermediate computations. The communication overhead has been kept reasonably low to make the whole protocol efficient and practical.
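As a rough illustration of the quantity the parties compute jointly, the sketch below evaluates the Gini gain of one attribute from per-party tallies over horizontally partitioned rows. It is a plaintext sketch only and is not the paper's protocol: the tallies are summed in the clear, whereas the paper evaluates the intermediate computations with secure sub-protocols that this record does not describe. The function names, the row layout (one dict per record), and the toy data are all assumptions made for the example.

```python
from collections import Counter


def gini(counts):
    """Gini impurity 1 - sum(p_c^2) for a mapping class -> count."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def local_counts(rows, attr, label):
    """Tally held by one party: (attribute value, class label) -> count."""
    return Counter((row[attr], row[label]) for row in rows)


def gini_gain(parties, attr, label):
    """Gini gain of `attr`, computed from the sum of per-party tallies."""
    # ASSUMPTION: tallies are pooled in the clear. A privacy-preserving
    # protocol would obtain these quantities without exposing local counts.
    pooled = Counter()
    for rows in parties:
        pooled.update(local_counts(rows, attr, label))

    class_totals = Counter()
    by_value = {}
    for (value, cls), n in pooled.items():
        class_totals[cls] += n
        by_value.setdefault(value, Counter())[cls] += n

    total = sum(class_totals.values())
    if total == 0:
        return 0.0
    # Impurity after the split, weighted by the size of each branch.
    weighted = sum(sum(c.values()) / total * gini(c) for c in by_value.values())
    return gini(class_totals) - weighted


# Hypothetical toy data: two parties hold different rows with the same schema.
party_a = [{"outlook": "sunny", "play": "no"}, {"outlook": "sunny", "play": "no"}]
party_b = [{"outlook": "rain", "play": "yes"}, {"outlook": "rain", "play": "yes"}]
gain = gini_gain([party_a, party_b], "outlook", "play")
```

Because the rows are partitioned horizontally, every party holds the same attributes, so the gain depends only on the sums of the local tallies. That is why the result matches what a single party holding all the rows would compute.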
Keywords :
data mining; data privacy; decision trees; security of data; Gini index; ID3 algorithm privacy; communication overhead; data classification; decision tree creation; horizontally partitioned data; information-gain computation; multiparty computation protocols; multiparty subprotocol security; privacy-preserving data mining; Classification tree analysis; Data engineering; Data mining; Data privacy; Decision trees; Entropy; Impurities; Information technology; Partitioning algorithms; Protocols;
Conference_Title :
Computer Systems and Applications, 2008. AICCSA 2008. IEEE/ACS International Conference on
Conference_Location :
Doha
Print_ISBN :
978-1-4244-1967-8
Electronic_ISBN :
978-1-4244-1968-5
DOI :
10.1109/AICCSA.2008.4493598