Enabling access-privacy for random walk based data analysis applications

  • Authors:
  • Ping Lin;K. Selçuk Candan

  • Affiliations:
  • Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287-5406, USA;Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287-5406, USA

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2007

Abstract

Random walk graph and Markov chain based models are used heavily in many data and system analysis domains, including the web, bioinformatics, and queuing. These models enable the description and analysis of various behaviors of stochastic systems. If the system being modeled has certain properties, for example if it is irreducible and aperiodic, closed-form formulations of its stationary behavior can be used to analyze it. However, if the system does not have these properties, or if the user is not interested in the stationary behavior, then an iterative approach is needed to determine potential outcomes from the initial probability distribution given as input to the model. In this paper, we focus on access-privacy enabled outsourced Markov chain based data analysis applications, where a non-trusted service provider takes (hidden) user queries that are described in terms of initial state distributions and evaluates them iteratively in an oblivious manner. We show that this iterative process can leak information about the possible values of the hidden input if the server has a priori knowledge of the underlying Markovian process. Hence, as opposed to simple obfuscation mechanisms, we develop an algorithm based on the methodical addition of extra states, which guarantees unbounded feasible regions for the inputs and thus prevents a malicious host from making an informed guess about the inputs. In particular, we show that if the underlying transition matrix is diagonalizable, then we can compute the exact number of states needed for access-privacy, while if the matrix is non-diagonalizable, only a lower bound can be computed.
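
To make the iterative evaluation mentioned in the abstract concrete, the following is a minimal sketch of plain (non-private) iteration of an initial state distribution through a transition matrix. It is not the paper's access-privacy algorithm: the 3-state matrix `P`, the initial distribution `p0`, and the helper names `step` and `iterate` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: plain iterative evaluation of a Markov chain.
# The matrix P, the distribution p0, and the function names are hypothetical
# and are not taken from the paper.

def step(dist, P):
    """Apply one transition: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def iterate(dist, P, steps):
    """Propagate an initial state distribution through `steps` transitions."""
    for _ in range(steps):
        dist = step(dist, P)
    return dist

# Row-stochastic transition matrix of a small irreducible, aperiodic chain.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

# Initial state distribution: the walk starts in state 0 with certainty.
p0 = [1.0, 0.0, 0.0]

p2 = iterate(p0, P, 2)        # transient behavior after two steps
p_long = iterate(p0, P, 200)  # many steps: close to the stationary behavior
```

For this particular chain, `p_long` approaches the stationary distribution, whereas `p2` depends on the initial distribution `p0`. In the outsourced setting the abstract describes, that initial distribution is the hidden query, which is why the iterative evaluation is where information about the input can leak.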