Deciding what to observe next: adaptive variable selection for regression in multivariate data streams

  • Authors: Christoforos Anagnostopoulos; Niall M. Adams; David J. Hand
  • Affiliations: Imperial College London, South Kensington, London, UK; Imperial College London, South Kensington Campus, London, UK; Imperial College London, South Kensington, London, UK
  • Venue: Proceedings of the 2008 ACM Symposium on Applied Computing
  • Year: 2008

Abstract

Variable selection is valuable when analysing streaming data whose measurements are costly, as in intensive care monitoring or battery-powered sensor networks. In the presence of drift, selections must be revised continually, which calls for adaptive variable selection schemes. An important and novel problem then arises: variables that are not selected become missing variables, and this biases subsequent decisions. Here, we consider adaptive variable selection in the context of linear regression, observing only a fraction of the available regressors at each timepoint. We propose a scheme that fits a multivariate Gaussian over a sliding window using the EM algorithm and chooses which variables to observe next using the Lasso. We experiment with simulated and real data to demonstrate that very high prediction accuracy can be retained using as little as 10% of the data.
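
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that a Gaussian has already been fitted to the current window (for example by EM) and that its regressor covariance and regressor-response covariances are available. The Lasso is then solved by plain coordinate descent on those moments, and the regressors with non-zero coefficients are the ones to observe next. The function names, the penalty value and the toy covariances are illustrative assumptions and are not taken from the paper, whose exact procedure may differ.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the Lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_from_moments(cov_xx, cov_xy, penalty, sweeps=200, tol=1e-10):
    """Minimise 0.5*b'Sb - s'b + penalty*||b||_1 by coordinate descent.

    cov_xx: p x p regressor covariance S (list of lists), assumed to come
            from a Gaussian fitted to the current window.
    cov_xy: length-p covariances s between each regressor and the response.
    """
    p = len(cov_xy)
    beta = [0.0] * p
    for _ in range(sweeps):
        largest_change = 0.0
        for j in range(p):
            # Covariance of regressor j with the partial residual.
            rho = cov_xy[j] - sum(
                cov_xx[j][k] * beta[k] for k in range(p) if k != j
            )
            updated = soft_threshold(rho, penalty) / cov_xx[j][j]
            largest_change = max(largest_change, abs(updated - beta[j]))
            beta[j] = updated
        if largest_change < tol:
            break
    return beta


def select_variables(cov_xx, cov_xy, penalty):
    """Return the indices of regressors with non-zero Lasso coefficients."""
    beta = lasso_from_moments(cov_xx, cov_xy, penalty)
    return [j for j, b in enumerate(beta) if b != 0.0]


# Toy moments (hypothetical): regressor 1 is correlated with regressor 0,
# regressor 2 is only weakly related to the response.
toy_cov_xx = [[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]]
toy_cov_xy = [0.9, 0.45, 0.05]
print(select_variables(toy_cov_xx, toy_cov_xy, penalty=0.1))  # [0]
```

In this toy case only regressor 0 is kept: regressor 1 adds little once regressor 0 is in the model, and regressor 2 falls below the penalty. Working from covariances rather than raw rows is a design choice made for this sketch, since it lets the selection run on whatever moment estimates the missing-data fit provides.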