Storage management for objects in EXODUS
Object-oriented concepts, databases, and applications
The Starburst long field manager
VLDB '89 Proceedings of the 15th international conference on Very large data bases
Optimization for dynamic inverted index maintenance
SIGIR '90 Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval
The performance of three database storage structures for managing large objects
SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
Synthetic workload performance analysis of incremental updates
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Incremental updates of inverted lists for text document retrieval
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Overview of the second text retrieval conference (TREC-2)
TREC-2 Proceedings of the second conference on Text retrieval conference
In situ generation of compressed inverted files
Journal of the American Society for Information Science
Filtered document retrieval with frequency-sorted indexes
Journal of the American Society for Information Science
Self-indexing inverted files for fast text retrieval
ACM Transactions on Information Systems (TOIS)
Inverted files versus signature files for text indexing
ACM Transactions on Database Systems (TODS)
Compressed inverted files with reduced decoding overheads
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Efficient passage ranking for document databases
ACM Transactions on Information Systems (TOIS)
ACM Computing Surveys (CSUR)
Searching the Web: the public and their queries
Journal of the American Society for Information Science and Technology
Burst tries: a fast, efficient data structure for string keys
ACM Transactions on Information Systems (TOIS)
In-memory hash tables for accumulating text vocabularies
Information Processing Letters
Modern Information Retrieval
Impact transformation: effective and efficient web retrieval
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Compression of inverted indexes For fast query evaluation
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Object and File Management in the EXODUS Extensible Database System
VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
An Efficient Database Storage Structure for Large Dynamic Objects
Proceedings of the Eighth International Conference on Data Engineering
Efficient single-pass index construction for text databases
Journal of the American Society for Information Science and Technology
ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
In-place versus re-build versus re-merge: index maintenance strategies for text retrieval systems
ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
Fast phrase querying with combined indexes
ACM Transactions on Information Systems (TOIS)
Inverted Index Compression Using Word-Aligned Binary Codes
Information Retrieval
A statistics-based approach to incrementally update inverted files
Information Processing and Management: an International Journal
Hybrid index maintenance for growing text collections
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Efficient in-memory extensible inverted file
Information Systems
Just in time indexing for up to the second search
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Efficient on-line index maintenance for dynamic text collections by using dynamic balancing tree
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Hybrid index maintenance for contiguous inverted lists
Information Retrieval
Efficient online index construction for text databases
ACM Transactions on Database Systems (TODS)
Efficient Index Maintenance for Frequently Updated Semantic Data
ASWC '08 Proceedings of the 3rd Asian Semantic Web Conference on The Semantic Web
Semplore: A scalable IR approach to search the Web of Data
Web Semantics: Science, Services and Agents on the World Wide Web
On-line index maintenance using horizontal partitioning
Proceedings of the 18th ACM conference on Information and knowledge management
Low-cost management of inverted files for online full-text search
Proceedings of the 18th ACM conference on Information and knowledge management
Efficient answering of set containment queries for skewed item distributions
Proceedings of the 14th International Conference on Extending Database Technology
Scalable, statistical storage allocation for extensible inverted file construction
Journal of Systems and Software
Brief announcement: large-scale multimaps
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
A framework for utilising usage trends in the crawling and indexing process of search engines
International Journal of Knowledge and Web Intelligence
Index maintenance for time-travel text search
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Cache-Oblivious dictionaries and multimaps with negligible failure probability
MedAlg'12 Proceedings of the First Mediterranean conference on Design and Analysis of Algorithms
Fast candidate generation for real-time tweet search with bloom filter chains
ACM Transactions on Information Systems (TOIS)
Hi-index | 0.00 |
Search engines and other text retrieval systems use high-performance inverted indexes to provide efficient text query evaluation. Algorithms for fast query evaluation and index construction are well-known, but relatively little has been published concerning update. In this paper, we experimentally evaluate the two main alternative strategies for index maintenance in the presence of insertions, with the constraint that inverted lists remain contiguous on disk for fast query evaluation. The in-place and re-merge strategies are benchmarked against the baseline of a complete re-build. Our experiments with large volumes of web data show that re-merge is the fastest approach if large buffers are available, but that even a simple implementation of in-place update is suitable when the rate of insertion is low or memory buffer size is limited. We also show that with careful design of aspects of implementation such as free-space management, in-place update can be improved by around an order of magnitude over a naïve implementation.