Full text databases
Information retrieval: data structures and algorithms
Information retrieval: data structures and algorithms
Incremental updates of inverted lists for text document retrieval
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric
Journal of the ACM (JACM)
Signature files: an access method for documents and its analytical performance evaluation
ACM Transactions on Information Systems (TOIS)
Modern Information Retrieval
Hi-index | 0.00 |
This paper proposes two new character-based full-text indexing models, i.e., adjacency matrix based inverted file and adjacency matrix based PAT array. Formally, the former is a kind of reorganization of the traditional inverted file, and the latter is a kind of decomposition of the traditional PAT array. Both organize text-indexing information in the form of adjacency matrix. Query algorithms for the new models are developed and performance comparisons between the new models and the traditional models are carried out. The new models can improve query-processing efficiency considerably at the cost of much less amount of extra storage overhead compared to the size of original text database, so are suitable for applications of large-scale text databases, especially Chinese text databases.