Dynamic memory allocation policies for postings in real-time Twitter search

  • Authors:
  • Nima Asadi; Jimmy Lin; Michael Busch

  • Affiliations:
  • University of Maryland, College Park, MD, USA; University of Maryland, College Park, MD, USA; Twitter, San Francisco, CA, USA

  • Venue:
  • Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2013

Abstract

We explore a real-time Twitter search application where tweets arrive at a rate of several thousand per second. Real-time search demands that tweets be indexed and searchable immediately, which leads to a number of implementation challenges. In this paper, we focus on one aspect: dynamic postings allocation policies for index structures that are held completely in main memory. The core issue can be characterized as a "Goldilocks Problem". Because memory remains a scarce resource today, an allocation policy that is too aggressive leads to inefficient utilization, while a policy that is too conservative is slow and leads to fragmented postings lists. We present a dynamic postings allocation policy that allocates memory in increasingly larger "slices" from a small number of large, fixed pools of memory. With an analytical model and experiments, we explore different settings that balance time (query evaluation speed) and space (memory utilization).
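
The slice-based policy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name SlicePools, the slice sizes (2, 16, 128, 2048), the pool capacity, and the per-term bookkeeping dictionaries are all assumptions chosen for illustration. Each pool hands out slices of a single size, and a term's postings list moves to the next larger slice size each time its current slice fills up.

```python
from array import array

# Hypothetical slice sizes (in postings), one per pool; illustrative only.
SLICE_SIZES = (2, 16, 128, 2048)
POOL_CAPACITY = 1 << 16  # slots per pool; an assumed, fixed size


class SlicePools:
    """A few fixed pools of memory, each handing out slices of one size."""

    def __init__(self):
        # Each pool is a preallocated, fixed-size block of unsigned-int slots.
        self.pools = [array("I", [0]) * POOL_CAPACITY for _ in SLICE_SIZES]
        self.next_free = [0] * len(SLICE_SIZES)  # bump pointer per pool
        self.slices = {}  # term -> list of (pool index, start offset)
        self.counts = {}  # term -> number of postings stored

    def _allocate(self, level):
        """Reserve the next unused slice in the pool at the given level."""
        start = self.next_free[level]
        if start + SLICE_SIZES[level] > POOL_CAPACITY:
            raise MemoryError(f"pool {level} is exhausted")
        self.next_free[level] = start + SLICE_SIZES[level]
        return level, start

    def add(self, term, doc_id):
        """Append doc_id to the term's postings, growing the list if needed."""
        slices = self.slices.setdefault(term, [])
        used = self.counts.get(term, 0)
        capacity = sum(SLICE_SIZES[level] for level, _ in slices)
        if used == capacity:
            # Current slice is full (or none exists yet): take the next,
            # larger slice size, staying at the largest size once reached.
            level = min(len(slices), len(SLICE_SIZES) - 1)
            slices.append(self._allocate(level))
            filled_before = capacity
        else:
            filled_before = capacity - SLICE_SIZES[slices[-1][0]]
        level, start = slices[-1]
        self.pools[level][start + used - filled_before] = doc_id
        self.counts[term] = used + 1

    def postings(self, term):
        """Return the term's postings in insertion order."""
        out = []
        remaining = self.counts.get(term, 0)
        for level, start in self.slices.get(term, []):
            n = min(remaining, SLICE_SIZES[level])
            out.extend(self.pools[level][start:start + n])
            remaining -= n
        return out
```

With these assumed sizes, a term that has received 20 postings holds three slices (2, 16, and 128 slots), so 126 slots are reserved but unused. Larger growth steps would waste more memory on rare terms, while smaller steps would split frequent terms across many slices, which is the time and space trade-off the abstract calls the "Goldilocks Problem".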