Mining effective multi-segment sliding window for pathogen incidence rate prediction

  • Authors:
  • Lei Duan;Changjie Tang;Xiaosong Li;Guozhu Dong;Xianming Wang;Jie Zuo;Min Jiang;Zhongqi Li;Yongqing Zhang

  • Affiliations:
  • School of Computer Science, Sichuan University, Chengdu 610065, China;School of Computer Science, Sichuan University, Chengdu 610065, China and National Key Laboratory of Air Traffic Control Automation System Technology, Chengdu 610065, China;West China School of Public Health, Sichuan University, Chengdu 610041, China;Department of Computer Science & Engineering, Wright State University, Dayton 45435, USA;School of Computer Science, Sichuan University, Chengdu 610065, China;School of Computer Science, Sichuan University, Chengdu 610065, China;West China School of Public Health, Sichuan University, Chengdu 610041, China;School of Computer Science, Sichuan University, Chengdu 610065, China;School of Computer Science, Sichuan University, Chengdu 610065, China

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2013

Abstract

Pathogen incidence rate prediction, which can be treated as a time series modeling problem, is an important task for infectious disease incidence rate prediction and for public health. This paper investigates the application of a genetic computation technique, gene expression programming (GEP), to pathogen incidence rate prediction. To overcome the shortcomings of traditional sliding windows in GEP-based time series modeling, the paper introduces the problem of mining effective sliding windows, that is, discovering optimal sliding windows for building accurate prediction models. To exploit the periodic characteristics of pathogen incidence rates, a multi-segment sliding window consisting of several segments from different periodic intervals is proposed. Since the number of candidate windows is still very large, a heuristic method is designed for enumerating the candidate effective multi-segment sliding windows. Methods are also proposed for finding the optimal sliding window and then producing a mathematical model based on that window. A performance study on real-world datasets shows that the techniques are effective and efficient for pathogen incidence rate prediction.
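
To make the idea of a multi-segment sliding window concrete, the following is a minimal Python sketch of one possible reading of it. The abstract does not define the window formally, so everything here is an assumption made for illustration: the names `multi_segment_window` and `training_pairs`, and the representation of a segment as a `(lag_start, length)` pair are hypothetical. The sketch only shows how inputs for a predictor could be gathered from several disjoint stretches of history (for example, the most recent values plus values from about one period earlier); it does not reproduce the paper's heuristic enumeration or its GEP model.

```python
from typing import List, Sequence, Tuple

# Assumed representation (not from the paper): a segment (lag_start, length)
# covers series[t - lag_start : t - lag_start + length].
Segment = Tuple[int, int]


def multi_segment_window(series: Sequence[float], t: int,
                         segments: Sequence[Segment]) -> List[float]:
    """Collect the input values used to predict series[t]."""
    window: List[float] = []
    for lag_start, length in segments:
        if length < 1 or length > lag_start:
            # length > lag_start would reach series[t] or later values.
            raise ValueError("segment must lie strictly before position t")
        start = t - lag_start
        if start < 0:
            raise IndexError("not enough history for this segment")
        window.extend(series[start:start + length])
    return window


def training_pairs(series: Sequence[float],
                   segments: Sequence[Segment]):
    """Build (inputs, target) pairs for every t with enough history."""
    max_lag = max(lag for lag, _ in segments)
    return [(multi_segment_window(series, t, segments), series[t])
            for t in range(max_lag, len(series))]


# Hypothetical monthly series with an assumed period of 12:
# one segment around the same month a year earlier, one of recent months.
monthly = list(range(100, 136))          # 36 placeholder values
segments = [(13, 3), (2, 2)]             # t-13..t-11 and t-2..t-1
pairs = training_pairs(monthly, segments)
```

Under this reading, a traditional sliding window is the special case of a single segment `(w, w)`, and the search problem described in the abstract amounts to choosing which segments to include.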