Document listing for queries with excluded pattern

  • Authors:
  • Wing-Kai Hon;Rahul Shah;Sharma V. Thankachan;Jeffrey Scott Vitter

  • Affiliations:
  • National Tsing Hua University, Taiwan;Louisiana State University;Louisiana State University;The University of Kansas

  • Venue:
  • CPM'12: Proceedings of the 23rd Annual Conference on Combinatorial Pattern Matching
  • Year:
  • 2012

Abstract

Let $\mathcal{D} = \{d_1, d_2, \ldots, d_D\}$ be a given collection of $D$ string documents of total length $n$. We consider the problem of indexing $\mathcal{D}$ such that, whenever two patterns $P^+$ and $P^-$ come as an online query, we can list all those documents containing $P^+$ but not $P^-$. Let $t$ represent the number of such documents. An index proposed by Fischer et al. (LATIN, 2012) can answer this query in $O(|P^+|+|P^-|+t+\sqrt{n})$ time. However, its space requirement is $O(n^{3/2})$ bits. We propose the first linear-space index for this problem with a worst-case query time of $O(|P^+|+|P^-|+\sqrt{n}\log \log n+\sqrt{nt}\log^{2.5} n)$.
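
To make the query semantics concrete, the following is a minimal brute-force sketch of the problem statement only. It is not the index proposed in the paper, nor the index of Fischer et al.: it scans every document, so each query takes time proportional to the total length n rather than the bounds quoted above. The function name `list_documents`, the sample documents, and the use of 1-based document identifiers are assumptions made for illustration, and both patterns are assumed to be non-empty.

```python
from typing import List


def list_documents(docs: List[str], p_plus: str, p_minus: str) -> List[int]:
    """Return the 1-based ids of documents that contain p_plus but not p_minus.

    Illustrative baseline only: every document is scanned on every query,
    so no preprocessing or index is involved. Both patterns are assumed
    to be non-empty.
    """
    return [
        doc_id
        for doc_id, text in enumerate(docs, start=1)
        if p_plus in text and p_minus not in text
    ]


# Hypothetical collection with D = 4 documents.
docs = ["banana", "ananas", "cabana", "anagram"]

# Documents containing "ana" but not "ban".
result = list_documents(docs, "ana", "ban")
print(result)       # [2, 4]
print(len(result))  # t = 2
```

Here all four documents contain "ana", but "banana" and "cabana" also contain the excluded pattern "ban", so only documents 2 and 4 are reported and t = 2.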