An efficient and practical algorithm for the many-keyword proximity problem by offsets

  • Authors: Sung-Ryul Kim; Jiman Hong
  • Affiliations: Division of Internet & Media and CAESIT, Konkuk University; School of Computer Science and Engineering, Kwangwoon University
  • Venue: RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part II
  • Year: 2005

Abstract

One of the most important relevance factors in Web search is the proximity score, i.e., how close together the query keywords appear in a given document. A basic proximity score is the size of the smallest range that contains all the keywords in the query. We generalize the proximity score to cover many practically important cases and present an O(n log k)-time algorithm for the generalized problem, where k is the number of keywords and n is the total number of occurrences of the keywords in the document.
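
To make the basic proximity score concrete, here is a minimal sketch for the basic problem only: finding the smallest range that contains at least one occurrence of every keyword. It is not the authors' generalized algorithm, which the abstract does not describe; it uses the familiar heap-based k-way merge, which also runs in O(n log k) time. The function name `smallest_range` and the input layout (one sorted list of offsets per keyword) are assumptions made for illustration.

```python
import heapq


def smallest_range(positions):
    """Return (start, end) of the smallest range covering every keyword.

    `positions` is an assumed input format: a list of k lists, where
    positions[i] holds the sorted offsets at which keyword i occurs.
    Returns None if there are no keywords or some keyword never occurs.
    """
    if not positions or any(not p for p in positions):
        return None

    # Heap entries: (offset, keyword index, index into that keyword's list).
    # The heap always holds exactly one candidate occurrence per keyword.
    heap = [(p[0], i, 0) for i, p in enumerate(positions)]
    heapq.heapify(heap)
    current_max = max(p[0] for p in positions)
    best = (heap[0][0], current_max)

    while True:
        offset, i, j = heapq.heappop(heap)
        # The candidates span [offset, current_max]; keep it if it is smaller.
        if current_max - offset < best[1] - best[0]:
            best = (offset, current_max)
        # Once the leftmost keyword has no later occurrence, no further
        # range can cover all keywords, so the search is finished.
        if j + 1 == len(positions[i]):
            return best
        nxt = positions[i][j + 1]
        current_max = max(current_max, nxt)
        heapq.heappush(heap, (nxt, i, j + 1))


if __name__ == "__main__":
    offsets = [[1, 10, 20], [4, 12, 30], [7, 15, 25]]
    print(smallest_range(offsets))
```

Each of the n occurrences is pushed and popped at most once, and the heap never holds more than k entries, which is where the O(n log k) bound for this basic version comes from.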