Privacy preserving feature selection for distributed data using virtual dimension

Authors:
Madhushri Banerjee;Sumit Chakravarty
Affiliations:
Georgia Gwinnett College, Lawrenceville, GA, USA;Stinger & Ghaffarian Technologies Inc., Greenbelt, MD, USA
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Abstract

Data mining often suffers from the curse of dimensionality. A large number of dimensions, or attributes, in the data poses serious problems for data mining tasks. Traditionally, dimensionality reduction techniques such as Principal Component Analysis have been used to address this problem. However, it may be necessary to remain in the original attribute space and identify the key predictive attributes instead of moving to a transformed space. As a result, feature subset selection has become an important area of research over the last few years. With the advent of network technologies, data is sometimes distributed across multiple locations and is often held by multiple parties. The biggest concern when sharing data is data privacy. This paper proposes a secure distributed protocol that allows multiple parties to perform feature selection without revealing their own data. The proposed distributed feature selection method evolved from virtual dimension reduction, a method used in hyperspectral image processing to select a subset of hyperspectral bands for further analysis. Experimental results on real-life datasets demonstrate the effectiveness of the proposed method.
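
The abstract does not spell out the protocol, so the following is only an illustrative sketch of one building block that privacy-preserving distributed computations commonly rely on: a secure sum based on additive secret sharing. It is not the authors' method. The names `make_shares`, `secure_sum`, and `MODULUS` are hypothetical, all parties are simulated in a single process, and the sketch assumes semi-honest parties and non-negative integer inputs whose true sum is smaller than the modulus.

```python
import secrets

# Hypothetical public modulus; assumed to be larger than any true sum.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split an integer into n additive shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate every party locally and return the sum of their private values."""
    n = len(private_values)
    # Row i holds the shares created by party i; entry j would be sent to party j.
    share_matrix = [make_shares(v, n, modulus) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partials = [
        sum(share_matrix[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Combining the published partial sums reveals the total, not the inputs.
    return sum(partials) % modulus


# Three parties each hold a private count for the same attribute.
total = secure_sum([12, 30, 8])
print(total)  # 50
```

Each individual share is uniformly random, so no single message reveals a party's input; only the aggregate is disclosed. Aggregate statistics of this kind (sums, counts, second moments) are the sort of quantity a distributed feature selection procedure could compute without pooling raw records.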