Processing continuous text queries featuring non-homogeneous scoring functions

  • Authors:
  • Nelly Vouzoukidou;Bernd Amann;Vassilis Christophides

  • Affiliations:
  • Pierre et Marie Curie University, Paris, France;Pierre et Marie Curie University, Paris, France;ICS/FORTH & University of Crete, Heraklion, Greece

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Abstract

In this work we are interested in the scalable processing of content filtering queries over text item streams. In particular, we aim to generalize state-of-the-art solutions to handle non-homogeneous scoring functions that combine query-independent item importance with query-dependent content relevance. While such complex ranking functions are widely used in web search engines, this is, to our knowledge, the first scientific work studying their use in a continuous query scenario. Our main contribution is the definition and evaluation of new, efficient in-memory data structures for indexing continuous top-k queries, based on an original two-dimensional representation of text queries. We explore locally optimal score bounds and heuristics that efficiently prune the search space of candidate top-k query results that have to be updated on the arrival of new stream items. Finally, we experimentally evaluate the memory/matching-time trade-offs of these index structures and, in particular, illustrate their linear scaling behavior with respect to the number of indexed queries.
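To make the setting concrete, the following is a minimal Python sketch of a continuous top-k matcher with a non-homogeneous score. It is not the index structure proposed in the paper: the class name ContinuousTopK, the linear combination alpha * importance + (1 - alpha) * relevance, the term-weight model of relevance, and the simple per-query upper bound used for pruning are all assumptions chosen for illustration. The sketch only shows the general idea of skipping queries whose top-k result cannot be changed by an arriving item.

```python
import heapq
from collections import defaultdict


class ContinuousTopK:
    """Naive continuous top-k matcher (illustrative, not the paper's index)."""

    def __init__(self, k, alpha=0.5):
        self.k = k                         # result size per query
        self.alpha = alpha                 # assumed weight of item importance
        self.queries = {}                  # query id -> {term: weight >= 0}
        self.max_rel = {}                  # query id -> largest possible relevance
        self.index = defaultdict(set)      # term -> ids of queries containing it
        self.results = defaultdict(list)   # query id -> min-heap of (score, item id)

    def add_query(self, qid, weights):
        # Weights are assumed non-negative so that max_rel is a valid bound.
        self.queries[qid] = dict(weights)
        self.max_rel[qid] = sum(weights.values())
        for term in weights:
            self.index[term].add(qid)

    def score(self, qid, terms, importance):
        # Query-dependent part: sum of weights of query terms found in the item.
        relevance = sum(w for t, w in self.queries[qid].items() if t in terms)
        # Query-independent part: the item's importance.
        return self.alpha * importance + (1 - self.alpha) * relevance

    def process(self, item_id, terms, importance):
        """Match one stream item; return the ids of queries whose top-k changed."""
        terms = set(terms)
        candidates = set()
        for t in terms:
            candidates |= self.index.get(t, set())
        updated = []
        for qid in candidates:
            heap = self.results[qid]
            if len(heap) == self.k:
                # Prune: even a full match could not beat the current k-th score.
                bound = self.alpha * importance + (1 - self.alpha) * self.max_rel[qid]
                if bound <= heap[0][0]:
                    continue
            s = self.score(qid, terms, importance)
            if len(heap) < self.k:
                heapq.heappush(heap, (s, item_id))
                updated.append(qid)
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, item_id))
                updated.append(qid)
        return sorted(updated)

    def top_k(self, qid):
        # Current result of a query, best score first.
        return sorted(self.results[qid], reverse=True)
```

In this sketch, every query sharing a term with the item is still visited, so the cost per item grows with the number of matching queries; the pruning test merely avoids computing exact scores that cannot enter a result. The data structures described in the abstract aim to restrict that candidate set more aggressively.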