An efficient and practical algorithm for the many-keyword proximity problem by offsets

Authors:
Sung-Ryul Kim; Jiman Hong
Affiliations:
Division of Internet & Media and CAESIT, Konkuk University; School of Computer Science and Engineering, Kwangwoon University
Venue:
RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part II
Year:
2005

Abstract

One of the most important relevance factors in the Web search context is the proximity score, i.e., how close the keywords appear together in a given document. A basic proximity score is given by the size of the smallest range containing all the keywords in the query. We generalize the proximity score to include many practically important cases and present an O(n log k) time algorithm for the generalized problem, where k is the number of keywords and n is the number of occurrences of the keywords in a document.
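
The basic proximity score described above (the size of the smallest range containing every keyword) can be illustrated with a short sketch. This is not the authors' generalized algorithm, which the abstract does not spell out; it is a standard heap-based k-way merge for the basic problem only. It assumes each keyword's occurrences are given as a sorted list of integer offsets. The function name `smallest_range` and the input layout are hypothetical choices made for this illustration.

```python
import heapq


def smallest_range(occurrences):
    """Return (start, end) of the smallest offset range that contains at
    least one occurrence of every keyword, or None if a keyword is absent.

    occurrences: list of k sorted lists of integer offsets, one list per
    keyword (an assumed input format, not taken from the paper).
    """
    if not occurrences or any(not offs for offs in occurrences):
        return None

    # Heap entries: (offset, keyword index, position in that keyword's list).
    heap = [(offs[0], kw, 0) for kw, offs in enumerate(occurrences)]
    heapq.heapify(heap)
    right = max(offs[0] for offs in occurrences)  # largest offset in the window
    best = (heap[0][0], right)

    while True:
        left, kw, i = heap[0]  # smallest offset currently in the window
        if right - left < best[1] - best[0]:
            best = (left, right)
        if i + 1 == len(occurrences[kw]):
            # The leftmost keyword has no later occurrence, so no later
            # window can still contain every keyword.
            return best
        nxt = occurrences[kw][i + 1]
        right = max(right, nxt)
        heapq.heapreplace(heap, (nxt, kw, i + 1))


if __name__ == "__main__":
    # Three keywords with their offsets in a document.
    print(smallest_range([[1, 10, 20], [4, 12], [9, 15]]))  # (9, 12)
```

Each of the n occurrences enters the heap at most once and the heap never holds more than k entries, so this sketch runs in O(n log k) time, the same bound the abstract states for the generalized problem.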