Privacy-preserving Cox regression for survival analysis

Authors:
Shipeng Yu; Glenn Fung; Romer Rosales; Sriram Krishnan; R. Bharat Rao; Cary Dehing-Oberije; Philippe Lambin
Affiliations:
Siemens Medical Solutions USA, Inc., Malvern, PA, USA; Siemens Medical Solutions USA, Inc., Malvern, PA, USA; Siemens Medical Solutions USA, Inc., Malvern, PA, USA; Siemens Medical Solutions USA, Inc., Malvern, PA, USA; Siemens Medical Solutions USA, Inc., Malvern, PA, USA; University Hospital Maastricht, Maastricht, Netherlands; University Hospital Maastricht, Maastricht, Netherlands
Venue:
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2008

Abstract

Privacy-preserving data mining (PPDM) is an emergent research area that addresses the incorporation of privacy-preserving concerns into data mining techniques. In this paper we propose a privacy-preserving (PP) Cox model for survival analysis, and consider a real clinical setting where the data is horizontally distributed among different institutions. The proposed model is based on linearly projecting the data to a lower-dimensional space through an optimal mapping obtained by solving a linear programming problem. Our approach differs from the commonly used random projection approach in that it finds a projection that is optimal at preserving the properties of the data that are important for the specific problem at hand. Since our proposed approach produces a sparse mapping, it also generates a PP mapping that not only projects the data to a lower-dimensional space but also depends on a smaller subset of the original features (that is, it provides explicit feature selection). Real data from several European healthcare institutions are used to test our model for survival prediction of non-small-cell lung cancer patients. These results are also confirmed using publicly available benchmark datasets. Our experimental results show that we are able to achieve near-optimal performance without directly sharing the data across different data sources. This model makes it possible to conduct large-scale multi-centric survival analysis without violating privacy-preserving requirements.
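The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the linear program that yields the mapping, so the sparse matrix `B` below is hand-set and purely hypothetical, as are the function names, the synthetic site data, and the follow-up times. The sketch only shows that each institution can project its own rows locally, that a sparse mapping touches a subset of the original features, and that a Cox partial likelihood can then be evaluated on the pooled projected data.

```python
import numpy as np

def project_site(X, B):
    """Low-dimensional representation X @ B that a site would share."""
    return X @ B

def selected_features(B, tol=1e-12):
    """Indices of original features used by the mapping (nonzero rows of B)."""
    return np.flatnonzero(np.abs(B).sum(axis=1) > tol)

def cox_neg_log_partial_likelihood(beta, Z, time, event):
    """Negative Cox partial log-likelihood; assumes no tied event times."""
    eta = Z @ beta
    order = np.argsort(-time)                # descending time: prefix = risk set
    eta, event = eta[order], event[order]
    log_risk = np.logaddexp.accumulate(eta)  # log of summed exp(eta) over risk set
    return -np.sum((eta - log_risk)[event == 1])

# Hypothetical sparse mapping from d = 6 features to k = 2 dimensions.
# In the paper this mapping comes from a linear program; here it is hand-set.
d, k = 6, 2
B = np.zeros((d, k))
B[0, 0] = 1.0
B[3, 0] = -0.5
B[3, 1] = 2.0

# Synthetic, horizontally partitioned data: two sites, same features, different patients.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(5, d))
X_b = rng.normal(size=(4, d))

# Each site projects locally; only the projected rows are pooled.
Z = np.vstack([project_site(X_a, B), project_site(X_b, B)])

# Hypothetical follow-up times and event indicators (1 = event, 0 = censored).
time = np.arange(1.0, 10.0)
event = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1])

nll_at_zero = cox_neg_log_partial_likelihood(np.zeros(k), Z, time, event)
used = selected_features(B)
```

Because the projection is linear and applied row by row, projecting at each site and then stacking gives the same matrix as projecting the stacked raw data, which is why the raw records never need to leave an institution in this setup. Whether a given mapping offers adequate privacy protection is a separate question that this sketch does not address.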