Search engines need to evaluate queries extremely quickly, a challenging task given the vast quantities of data being indexed. A significant proportion of the queries posed to search engines involve phrases. In this paper we consider how phrase queries can be supported efficiently with low disk overheads. Previous research has shown that phrase queries can be rapidly evaluated using nextword indexes, but these indexes are twice as large as conventional inverted files. We propose combining nextword indexes with inverted files as a solution to this problem. Our experiments show that the combined use of an auxiliary nextword index and a conventional inverted file allows phrase queries to be evaluated in half the time required with an inverted file alone, with a space overhead of only 10% of the size of the inverted file. Further time savings are available with only slight increases in disk requirements.
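The combination described above can be illustrated with a small in-memory sketch. It is not the implementation evaluated in the paper: it assumes that the auxiliary nextword index stores word pairs only when the first word is among the most frequent words in the collection, and that all other lookups go to a complete positional inverted file. The names build_indexes, phrase_query, and num_common are hypothetical, and the sketch ignores compression and disk layout entirely.

```python
from collections import Counter, defaultdict


def build_indexes(docs, num_common=1):
    """Build a positional inverted file plus a partial nextword index.

    Assumption: only the num_common most frequent words act as first
    words in the nextword index; this cutoff is illustrative.
    """
    tokenized = [doc.lower().split() for doc in docs]
    freq = Counter(word for tokens in tokenized for word in tokens)
    common = {word for word, _ in freq.most_common(num_common)}

    # inverted: word -> doc id -> positions of the word
    inverted = defaultdict(lambda: defaultdict(list))
    # nextword: (first word, next word) -> doc id -> positions of the first word
    nextword = defaultdict(lambda: defaultdict(list))
    for doc_id, tokens in enumerate(tokenized):
        for pos, word in enumerate(tokens):
            inverted[word][doc_id].append(pos)
            if word in common and pos + 1 < len(tokens):
                nextword[(word, tokens[pos + 1])][doc_id].append(pos)
    return inverted, nextword, common


def phrase_query(phrase, inverted, nextword, common):
    """Return the ids of documents that contain the phrase exactly."""
    words = phrase.lower().split()
    if not words:
        return set()

    # Cover the phrase left to right: a common word followed by another
    # word is resolved with one nextword lookup, anything else with an
    # ordinary inverted-file lookup.
    lookups = []  # (offset of the lookup within the phrase, postings)
    i = 0
    while i < len(words):
        if words[i] in common and i + 1 < len(words):
            lookups.append((i, nextword.get((words[i], words[i + 1]), {})))
            i += 2
        else:
            lookups.append((i, inverted.get(words[i], {})))
            i += 1

    # Translate every posting into a candidate phrase start and intersect.
    matches = None
    for offset, postings in lookups:
        starts = {(doc_id, pos - offset)
                  for doc_id, positions in postings.items()
                  for pos in positions}
        matches = starts if matches is None else matches & starts
        if not matches:
            return set()
    return {doc_id for doc_id, _ in matches}


docs = [
    "the quick brown fox",
    "the lazy dog and the quick cat",
    "quick brown the fox",
]
inverted, nextword, common = build_indexes(docs, num_common=1)
print(sorted(phrase_query("the quick brown", inverted, nextword, common)))
```

In this sketch the pair lookup for a common first word replaces the long postings list of that word with the much shorter list for the specific pair, which is the intuition behind the time savings; restricting the nextword index to a few common first words is what keeps its size small relative to the inverted file.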