The vocabulary problem in human-system communication
Communications of the ACM
Information retrieval: data structures and algorithms
OHSUMED: an interactive retrieval evaluation and new large test collection for research
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Automatic thesaurus generation for an electronic community system
Journal of the American Society for Information Science
Proceedings of the first ACM international conference on Digital libraries
Improved hierarchical bit-vector compression in document retrieval systems
Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval
Journal of the American Society for Information Science
Modern Information Retrieval
The vocabulary problem in information retrieval arises because authors and indexers often use different terms for the same concept. A thesaurus defines mappings between different but related terms, and it is widely used in modern information retrieval systems to address the vocabulary problem. Chen et al. proposed the concept space approach to automatic thesaurus construction. A concept space contains the associations between every pair of terms. Previous research shows that a concept space is a useful tool for helping information searchers revise their queries to get better results from information retrieval systems. Constructing a concept space, however, is very computationally intensive. In this paper, we propose and evaluate an efficient algorithm for the incremental update of concept spaces. In our model, only strong associations are maintained, since they are the most useful in thesaurus construction. Our algorithm gains its efficiency from a pruning technique that avoids computing weak associations.
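As a rough illustration of the idea, the sketch below keeps per-term and per-pair document counts that can be updated one document at a time, and reports only the directed associations whose weight reaches a cutoff. It is not the algorithm evaluated in the paper: the class name `ConceptSpace`, the `threshold` parameter, and the weight definition (the fraction of documents containing term a that also contain term b) are assumptions chosen for simplicity, and the weak pairs are filtered after counting rather than pruned before they are computed.

```python
from collections import Counter
from itertools import combinations


class ConceptSpace:
    """Toy concept space: asymmetric term associations from co-occurrence counts."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # hypothetical cutoff for a "strong" association
        self.doc_freq = Counter()   # term -> number of documents containing it
        self.co_freq = Counter()    # (a, b) with a < b -> documents containing both

    def add_document(self, terms):
        """Incrementally update the counts with one new document."""
        unique = sorted(set(terms))
        self.doc_freq.update(unique)
        self.co_freq.update(combinations(unique, 2))

    def weight(self, a, b):
        """Association from a to b: share of a's documents that also contain b."""
        if self.doc_freq[a] == 0:
            return 0.0
        pair = (a, b) if a < b else (b, a)
        return self.co_freq[pair] / self.doc_freq[a]

    def strong_associations(self):
        """Return {(a, b): weight} for directed pairs at or above the threshold."""
        strong = {}
        for a, b in self.co_freq:
            for src, dst in ((a, b), (b, a)):
                w = self.weight(src, dst)
                if w >= self.threshold:
                    strong[(src, dst)] = w
        return strong


# Usage: three tiny "documents", each given as a list of index terms.
space = ConceptSpace(threshold=0.6)
space.add_document(["thesaurus", "retrieval", "query"])
space.add_document(["thesaurus", "retrieval"])
space.add_document(["retrieval", "index"])
strong = space.strong_associations()
```

Because the weight is asymmetric, "thesaurus" points strongly to "retrieval" (every document with "thesaurus" also has "retrieval"), while the reverse link is weaker. A real incremental algorithm would also need to avoid touching pairs that cannot become strong, which is the part this sketch leaves out.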