Privacy preserving ID3 using Gini Index over horizontally partitioned data

  • Authors: Saeed Samet; Ali Miri

  • Affiliations: School of Information Technology and Engineering (SITE), University of Ottawa, Canada (both authors)

  • Venue: AICCSA '08 Proceedings of the 2008 IEEE/ACS International Conference on Computer Systems and Applications

  • Year: 2008

Abstract

The ID3 algorithm is a standard, popular, and simple method for data classification and decision tree construction. Because privacy must often be preserved in data mining, several secure multi-party computation protocols have been built on it. Entropy and the Gini Index are two measures that can be used to compute the information gain at each step of building a decision tree. The Gini Index, however, has received less attention in privacy-preserving data mining protocols. In this paper, we show how the Gini Index can be used in a privacy-preserving ID3 algorithm to build a decision tree classifier when the database is horizontally partitioned over two or more parties: the parties jointly compute the gain value of each nominal attribute without revealing their private information to one another. Three secure multi-party sub-protocols are presented to evaluate the intermediate computations. The communication overhead is kept reasonably low so that the whole protocol is efficient and practical.
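
As a rough illustration of the quantity being computed, the sketch below evaluates the Gini-based gain of one attribute when the rows are split horizontally between two parties. It is a plaintext sketch only: it does not reproduce the paper's three secure sub-protocols, and the counts are summed in the clear, which the actual protocol is designed to avoid. The function names (`gini`, `local_counts`, `gini_gain`) and the toy `outlook`/`play` data are assumptions made for this example and do not come from the paper.

```python
from collections import Counter


def gini(class_counts):
    """Gini impurity 1 - sum(p_c^2) for a mapping class -> count."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in class_counts.values())


def local_counts(rows, attribute, label):
    """One party's counts of (attribute value, class label) pairs."""
    return Counter((row[attribute], row[label]) for row in rows)


def gini_gain(joint_counts):
    """Gain = Gini(parent) - weighted Gini of the per-value subsets."""
    class_totals = Counter()
    per_value = {}
    for (value, cls), n in joint_counts.items():
        class_totals[cls] += n
        per_value.setdefault(value, Counter())[cls] += n
    total = sum(class_totals.values())
    if total == 0:
        return 0.0
    weighted = sum(
        sum(counts.values()) / total * gini(counts)
        for counts in per_value.values()
    )
    return gini(class_totals) - weighted


# Hypothetical toy data: each party holds different rows over the same schema.
party_a = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
]
party_b = [
    {"outlook": "rain", "play": "yes"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rain", "play": "no"},
]

# With horizontal partitioning, the joint counts are the sum of local counts.
# Here they are added in the clear; a secure protocol would hide the addends.
joint = local_counts(party_a, "outlook", "play") + local_counts(party_b, "outlook", "play")
gain = gini_gain(joint)
```

The point of the sketch is that the gain depends on the parties' data only through summed counts, so a privacy-preserving version needs to evaluate this expression over sums that no single party sees in full.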