Reservoir sampling techniques in modern data analysis

  • Authors:
  • Anže Pečar; Miha Zidar; Matjaž Kukar

  • Affiliations:
  • University of Ljubljana, Ljubljana, Slovenia; University of Ljubljana, Ljubljana, Slovenia; University of Ljubljana, Ljubljana, Slovenia

  • Venue:
  • Proceedings of the Fifth Balkan Conference in Informatics
  • Year:
  • 2012

Abstract

Reservoir sampling is an interesting statistical sampling technique, developed almost 40 years ago to enable the analysis of data that was large-scale for its time while using limited computer memory. We present an overview of frequently used reservoir sampling techniques and discuss how they can be used for learning from data streams. While they are not ideal for all scenarios, they can easily be modified for many purposes, and they also find a place in surprisingly useful modern data analysis approaches.
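
As an illustration of the basic idea, the following is a minimal sketch of the classic single-pass variant often called Algorithm R. It is not taken from the paper and is not necessarily one of the techniques the paper surveys. The function name `reservoir_sample` and its parameters `stream`, `k`, and `rng` are hypothetical names chosen for this example. The sketch keeps at most `k` items in memory and, after seeing `n` items, holds each of them with probability `k/n`.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Hypothetical helper for illustration: the stream is read once and
    its length does not need to be known in advance.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an old one
            # only if the slot falls inside the reservoir, which happens
            # with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

A call such as `reservoir_sample(range(1_000_000), 10)` would return 10 values while storing only 10 items at any time. If the stream holds fewer than `k` items, all of them are returned.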