PURPOSE: We evaluate how argumentation in scientific articles can be used to propose an original index pruning strategy that significantly reduces the size of the engine's indexes while having only a limited impact on retrieval effectiveness. METHODS: A Bayesian classifier trained on explicitly structured MEDLINE abstracts generates the argumentative categories. The categories are used to generate four different argumentative indexes. A fifth index contains the complete abstract, together with the title and the list of Medical Subject Headings (MeSH) terms. This last index is used as the baseline against which results are compared when only a specific argumentative index is queried. RESULTS and CONCLUSION: When titles and MeSH terms are also stored in the respective indexes, querying the PURPOSE and CONCLUSION indexes achieves 78.4% and 74.3% of the baseline, respectively, while the size of the index is divided by two. It is concluded that argumentation can be a powerful index pruning strategy, complementing more traditional approaches.
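The pipeline described under METHODS can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the four argumentative categories are PURPOSE, METHODS, RESULTS and CONCLUSION (the abstract names only PURPOSE and CONCLUSION explicitly), uses a plain multinomial naive Bayes classifier with add-one smoothing as a stand-in for the unspecified Bayesian classifier, and works on made-up training sentences. The names `NaiveBayes`, `build_indexes`, `TRAIN` and `DOCS` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Assumed label set; only PURPOSE and CONCLUSION are named in the abstract.
CATEGORIES = ["PURPOSE", "METHODS", "RESULTS", "CONCLUSION"]


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter()            # label -> number of sentences
        self.vocab = set()

    def train(self, labelled_sentences):
        for label, sentence in labelled_sentences:
            self.class_counts[label] += 1
            for tok in tokenize(sentence):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def classify(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.class_counts:
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(sentence):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def build_indexes(docs, clf):
    """Return {index_name: {doc_id: text}} with one pruned index per
    category plus a FULL baseline index. Title and MeSH terms are stored
    in every index, as in the RESULTS setting of the abstract."""
    indexes = {c: {} for c in CATEGORIES}
    indexes["FULL"] = {}
    for doc_id, doc in docs.items():
        extras = doc["title"] + " " + " ".join(doc["mesh"])
        by_cat = defaultdict(list)
        for sentence in doc["sentences"]:
            by_cat[clf.classify(sentence)].append(sentence)
        for c in CATEGORIES:
            indexes[c][doc_id] = (extras + " " + " ".join(by_cat[c])).strip()
        indexes["FULL"][doc_id] = extras + " " + " ".join(doc["sentences"])
    return indexes


# Toy training data standing in for explicitly structured abstracts.
TRAIN = [
    ("PURPOSE", "we evaluate whether the aim of this study"),
    ("PURPOSE", "the purpose of this study was to investigate"),
    ("METHODS", "patients were randomized and a classifier was trained"),
    ("METHODS", "we used a cohort design and measured samples"),
    ("RESULTS", "the results showed a significant increase of 20 percent"),
    ("RESULTS", "we observed a significant decrease in the treated group"),
    ("CONCLUSION", "we conclude that the treatment is effective"),
    ("CONCLUSION", "it is concluded that further trials are needed"),
]

# One made-up unstructured abstract to be split across the indexes.
DOCS = {
    "d1": {
        "title": "Aspirin trial",
        "mesh": ["Aspirin"],
        "sentences": [
            "the purpose of this study was to investigate",
            "it is concluded that further trials are needed",
        ],
    }
}

clf = NaiveBayes()
clf.train(TRAIN)
indexes = build_indexes(DOCS, clf)
```

Each pruned index holds only the sentences assigned to one category, so querying `indexes["PURPOSE"]` instead of `indexes["FULL"]` searches a smaller text per document. That size reduction is what the reported results trade against retrieval effectiveness.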