Efficient phrase querying with common phrase index
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
In this paper, we propose a common phrase index, an efficient index structure for supporting phrase queries over a very large text database. Our structure extends previous index structures for phrases and achieves better query efficiency at a modest extra storage cost. Efficiency can be improved further by implementing the index in a way that accounts for our observation that the set of common words is dynamic. In our experimental evaluation, a common phrase index using 255 common words improves query time over an auxiliary nextword index by about 11% for queries overall and about 62% for large queries (queries consisting of long phrases), while incurring only about 19% extra storage cost. Compared with an inverted index, the improvement is about 72% overall and about 87% for large queries. We also propose implementing a common phrase index with a dynamic update feature; our experiments show that this yields a further improvement in time efficiency.
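
To make the general idea concrete, the following is a minimal sketch of how an index keyed on common words can shortcut phrase matching. It is not the structure evaluated in the paper. It assumes a simplified design in which a positional inverted index is supplemented by a pair index that records every (common word, following word) pair. The function names `build_indexes` and `phrase_query`, the whitespace tokenization, and the in-memory posting lists are all illustrative assumptions.

```python
from collections import defaultdict


def build_indexes(docs, common_words):
    """Build a positional inverted index plus a pair index (illustrative only).

    inverted:   term -> [(doc_id, position)]
    pair_index: (common_word, next_word) -> [(doc_id, position of common_word)]
    """
    inverted = defaultdict(list)
    pair_index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        tokens = text.lower().split()
        for pos, tok in enumerate(tokens):
            inverted[tok].append((doc_id, pos))
            # Only pairs that start with a common word get an extra entry.
            if tok in common_words and pos + 1 < len(tokens):
                pair_index[(tok, tokens[pos + 1])].append((doc_id, pos))
    return inverted, pair_index


def phrase_query(phrase, inverted, pair_index, common_words):
    """Return sorted (doc_id, start_position) pairs where the phrase occurs."""
    tokens = phrase.lower().split()
    if not tokens:
        return []

    # Split the phrase into units: a common word is merged with its successor
    # so that its (typically very long) posting list is never read on its own.
    units = []  # (offset of the unit within the phrase, posting list)
    i = 0
    while i < len(tokens):
        if tokens[i] in common_words and i + 1 < len(tokens):
            units.append((i, pair_index.get((tokens[i], tokens[i + 1]), [])))
            i += 2
        else:
            units.append((i, inverted.get(tokens[i], [])))
            i += 1

    # Intersect candidate start positions, shortest posting list first.
    candidates = None
    for offset, postings in sorted(units, key=lambda u: len(u[1])):
        starts = {(doc_id, pos - offset) for doc_id, pos in postings}
        candidates = starts if candidates is None else candidates & starts
        if not candidates:
            return []
    return sorted(candidates)
```

In this sketch, a query such as "the lazy dog" with "the" treated as common reads the posting list for the pair ("the", "lazy") and the list for "dog", rather than the full list for "the". The saving in query time comes at the price of the extra pair entries, which is the kind of time and storage trade-off the abstract describes.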