Inverted index structures are the mainstay of modern text retrieval systems. They can be constructed quickly using off-line merge-based methods, and provide efficient support for a variety of querying modes. In this paper we examine the task of on-line index construction, that is, how to build an inverted index when the underlying data must be continuously queryable and documents must be indexed and available for search as soon as they are inserted. When straightforward approaches are used, document insertions become increasingly expensive as the size of the database grows. This paper describes a mechanism based on controlled partitioning that can be adapted to suit different balances of insertion and querying operations, and that is faster and scales better than previous methods. Using experiments on 100 GB of web data, we demonstrate the efficiency of our methods in practice, showing that they dramatically reduce the cost of on-line index construction.
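To make the idea of controlled partitioning concrete, the following is a minimal sketch under stated assumptions, not the mechanism evaluated in the paper. The abstract does not give the partitioning rule, so the sketch assumes a simple doubling policy: new documents go into an in-memory buffer, a full buffer is flushed as a partition, and partitions at the same level are merged so that the number of partitions grows only logarithmically with the collection. The names `PartitionedIndex`, `buffer_docs`, and `levels` are hypothetical.

```python
from collections import defaultdict


class PartitionedIndex:
    """Toy on-line inverted index that keeps a bounded number of partitions.

    Assumption: a doubling (binary-counter) merge policy; the real
    mechanism may use a different, tunable ratio.
    """

    def __init__(self, buffer_docs=2):
        self.buffer_docs = buffer_docs    # hypothetical flush threshold (in documents)
        self.buffer = defaultdict(list)   # term -> doc ids, held in memory
        self.buffered = 0                 # number of documents in the buffer
        self.levels = []                  # levels[i] is None or a partition (dict)
        self.next_doc_id = 0

    def insert(self, text):
        """Index a document; it is searchable as soon as this returns."""
        doc_id = self.next_doc_id
        self.next_doc_id += 1
        for term in set(text.lower().split()):
            self.buffer[term].append(doc_id)
        self.buffered += 1
        if self.buffered >= self.buffer_docs:
            self._flush()
        return doc_id

    def _flush(self):
        """Turn the buffer into a partition, merging upward while levels are occupied."""
        carry = dict(self.buffer)
        self.buffer = defaultdict(list)
        self.buffered = 0
        level = 0
        while level < len(self.levels) and self.levels[level] is not None:
            carry = self._merge(self.levels[level], carry)
            self.levels[level] = None
            level += 1
        if level == len(self.levels):
            self.levels.append(None)
        self.levels[level] = carry

    @staticmethod
    def _merge(older, newer):
        """Concatenate postings; older doc ids come first, so lists stay sorted."""
        merged = {term: list(postings) for term, postings in older.items()}
        for term, postings in newer.items():
            merged.setdefault(term, []).extend(postings)
        return merged

    def search(self, term):
        """Return doc ids containing the term, reading every partition and the buffer."""
        term = term.lower()
        result = []
        for partition in reversed(self.levels):  # higher levels hold older documents
            if partition is not None:
                result.extend(partition.get(term, []))
        result.extend(self.buffer.get(term, []))
        return result
```

The trade-off the abstract alludes to is visible here: fewer, larger partitions make each query cheaper (fewer lists to read per term) but force more merging on insertion, while more partitions do the opposite. A policy with an adjustable merge ratio would let that balance be tuned.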