Optimization for dynamic inverted index maintenance
SIGIR '90 Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval
Incremental updates of inverted lists for text document retrieval
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
The log-structured merge-tree (LSM-tree)
Acta Informatica
Self-indexing inverted files for fast text retrieval
ACM Transactions on Information Systems (TOIS)
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Concurrency Control in B-Trees with Batch Updates
IEEE Transactions on Knowledge and Data Engineering
A Generic Approach to Bulk Loading Multidimensional Index Structures
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Efficient Search of Multi-Dimensional B-Trees
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
The LHAM log-structured history data access method
The VLDB Journal — The International Journal on Very Large Data Bases
Dynamic maintenance of web indexes using landmarks
WWW '03 Proceedings of the 12th international conference on World Wide Web
Generation Scavenging: A non-disruptive high performance storage reclamation algorithm
SDE 1 Proceedings of the first ACM SIGSOFT/SIGPLAN software engineering symposium on Practical software development environments
B-tree indexes for high update rates
ACM SIGMOD Record
Inverted files for text search engines
ACM Computing Surveys (CSUR)
Hybrid index maintenance for growing text collections
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Introduction to Information Retrieval
Introduction to Information Retrieval
Hi-index | 0.00 |
Many scenarios impose a heavy update load on B-tree indexes in modern databases. A typical case is when B-trees are used for indexing all the keywords of a text field. For example upon the insertion of a new text record (e.g. a new document arrives), a barrage of new keywords has to be inserted into the index causing many random disk I/Os and interrupting the normal operation of the database. The common approach has been to collect the updates in a separate structure and then perform a batch update of the index. This update "freezes" the database. Many applications, however, require the immediate availability of the new updates without any interruption of the normal database operation. In this paper we present a novel online B-tree update method based on a new buffering data structure we introduce - Dynamic Bucket Tree (DBT). The DBT-buffer serves as a differential index for new updates. The grouping of keys in DBT-buffer is based on the longest common prefixes (LCP) of their binary representations. The LCP is used as a measure of the locality of keys to be transferred to the main B-tree. Our online update system does not slow down concurrent user transactions or lead to degradation of search performance. Experiments confirm that our DBT buffer can be efficiently used for online updates of text fields. As such it represents an effective solution to the notorious problem of handling updates to an Inverted Index.