Algorithm combination for improved performance in biosurveillance systems

  • Authors:
  • Inbal Yahav;Galit Shmueli

  • Affiliations:
  • Department of Decision & Information Technologies and Center for Health Information and Decision Systems, Robert H Smith School of Business, University of Maryland, College Park, MD;Department of Decision & Information Technologies and Center for Health Information and Decision Systems, Robert H Smith School of Business, University of Maryland, College Park, MD

  • Venue:
  • BioSurveillance'07 Proceedings of the 2nd NSF conference on Intelligence and security informatics: BioSurveillance
  • Year:
  • 2007

Abstract

The majority of statistical research on detecting disease outbreaks from prediagnostic data has focused on tools for modeling the background behavior of such data and for monitoring the data for anomalies. Because prediagnostic data tend to include explainable patterns such as day-of-week, seasonality, and holiday effects, the monitoring process often calls for a two-step algorithm: first, a preprocessing technique is used to derive a residual series, and then the residuals are monitored using a classic control chart. Most studies apply a single combination of a preprocessing technique and a particular control chart to a particular type of data. Although the choice of preprocessing technique should be driven by the nature of the non-outbreak data, and the choice of control chart by the nature of the outbreak to be detected, the nature of both is often non-stationary and unclear, and varies considerably across data series. We therefore take an approach that combines algorithms rather than choosing a single one. In particular, we propose a method for combining multiple preprocessing algorithms and a method for combining multiple control charts, both based on linear programming. We show preliminary results for combining preprocessing techniques, applied to both simulated and authentic syndromic data.
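
The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It assumes day-of-week mean removal as the preprocessing step and a one-sided Shewhart chart as the control chart; both are illustrative choices, not the authors' specific algorithms, and the linear-programming combination itself is not shown. The function names, the synthetic counts, and the parameters `baseline`, `period`, and `k` are all hypothetical.

```python
from statistics import mean, pstdev


def day_of_week_residuals(counts, baseline, period=7):
    """Step 1 (preprocessing): subtract day-of-week means estimated
    from the first `baseline` days (assumed outbreak-free)."""
    slot_means = [mean(counts[d:baseline:period]) for d in range(period)]
    return [c - slot_means[i % period] for i, c in enumerate(counts)]


def shewhart_alarms(residuals, baseline, k=3.0):
    """Step 2 (monitoring): flag days whose residual exceeds the
    baseline mean by more than k baseline standard deviations."""
    mu = mean(residuals[:baseline])
    sigma = pstdev(residuals[:baseline])
    return [i for i, r in enumerate(residuals) if r > mu + k * sigma]


# Synthetic daily counts (illustrative only): a weekly pattern plus small
# deterministic noise over six weeks, with an injected spike on day 35.
weekly = [50, 48, 47, 46, 45, 30, 28]
counts = [weekly[i % 7] + ((2 * i) % 5 - 2) for i in range(42)]
counts[35] += 20

residuals = day_of_week_residuals(counts, baseline=28)
alarms = shewhart_alarms(residuals, baseline=28)
print(alarms)  # [35]
```

Estimating the day-of-week means and the control limits from an initial baseline window keeps the injected spike from contaminating the background model. A different preprocessing technique or control chart would slot into the same two-function structure.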