Text retrieval systems require an index to allow fast access to documents, at the cost of some storage overhead. This paper proposes a novel full-text indexing model for Chinese text retrieval based on the adjacency matrix of a directed graph. With this model, a retrieval system needs to keep only the index data, rather than both the index data and the original text as traditional retrieval systems do. The overall space cost of the system can therefore be reduced drastically while retrieval efficiency remains satisfactory. Experiments on five real-world Chinese text collections demonstrate the effectiveness and efficiency of the model.
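The abstract does not spell out the data structure, so the following is only a minimal sketch of one way an adjacency-style index could work, not the paper's actual model. It assumes that each distinct character is a vertex, that a directed edge (a, b) records every position where a is immediately followed by b, and that an end-of-document sentinel is appended so every character has an outgoing edge. The names `END`, `build_index`, `search`, and `reconstruct` are hypothetical.

```python
from collections import defaultdict

END = "\0"  # hypothetical end-of-document sentinel (assumption, not from the paper)


def build_index(docs):
    """Map each directed edge (a, b) to the (doc_id, offset) pairs
    where character a is immediately followed by character b."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        padded = text + END
        for offset in range(len(text)):
            edge = (padded[offset], padded[offset + 1])
            index[edge].append((doc_id, offset))
    return index


def search(index, query):
    """Return the sorted (doc_id, offset) pairs at which query occurs."""
    if len(query) < 2:
        # A single character matches every edge that leaves its vertex.
        return sorted(
            pos
            for (a, _), postings in index.items()
            if a == query
            for pos in postings
        )
    # Start from the first edge, then require each later edge to occur
    # at the matching shifted offset in the same document.
    candidates = set(index.get((query[0], query[1]), ()))
    for k in range(1, len(query) - 1):
        following = set(index.get((query[k], query[k + 1]), ()))
        candidates = {(d, o) for (d, o) in candidates if (d, o + k) in following}
    return sorted(candidates)


def reconstruct(index, doc_id):
    """Rebuild a document from the index alone, without the original text."""
    chars = {}
    for (a, _), postings in index.items():
        for d, offset in postings:
            if d == doc_id:
                chars[offset] = a
    return "".join(chars[o] for o in sorted(chars))
```

Because every edge stores its positions, `reconstruct` can recover a document from the index alone, which illustrates the abstract's point that the original text need not be stored separately. A dictionary of edges stands in here for a sparse adjacency matrix; a dense matrix over the full Chinese character set would mostly hold empty cells.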