We introduce static index pruning methods that significantly reduce the index size in information retrieval systems. We investigate uniform and term-based methods, each of which removes selected entries from the index while having only a minor effect on retrieval results. In uniform pruning, there is a single fixed cutoff threshold, and every index entry whose contribution to relevance scores is bounded above by that threshold is removed from the index. In term-based pruning, the cutoff threshold is determined separately for each term, and thus may vary from term to term. We give experimental evidence that, at each level of compression, term-based pruning outperforms uniform pruning under various measures of precision. We present theoretical and experimental evidence that, under our term-based pruning scheme, it is possible to prune the index greatly and still get retrieval results that are almost as good as those based on the full index.
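The difference between the two schemes can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the index layout (a dict mapping each term to a list of `(doc_id, score)` postings), the function names `uniform_prune` and `term_based_prune`, and the parameters `k` and `epsilon` are all assumptions made for illustration. In particular, the abstract does not say how the per-term threshold is chosen; the sketch assumes one plausible rule, a fraction `epsilon` of the k-th highest score in that term's posting list.

```python
from typing import Dict, List, Tuple

# Hypothetical index layout: term -> list of (doc_id, score contribution).
Posting = Tuple[str, float]
Index = Dict[str, List[Posting]]


def uniform_prune(index: Index, threshold: float) -> Index:
    """Drop every posting whose score is at or below one global threshold."""
    return {
        term: [(doc, score) for doc, score in postings if score > threshold]
        for term, postings in index.items()
    }


def term_based_prune(index: Index, k: int, epsilon: float) -> Index:
    """Drop postings using a threshold computed separately for each term.

    Assumed rule (not taken from the abstract): the threshold for a term is
    epsilon times the k-th highest score in its posting list. Lists with
    fewer than k postings get a threshold of 0 and are left untouched.
    """
    pruned: Index = {}
    for term, postings in index.items():
        scores = sorted((score for _, score in postings), reverse=True)
        kth_score = scores[k - 1] if len(scores) >= k else 0.0
        threshold = epsilon * kth_score
        pruned[term] = [(doc, score) for doc, score in postings if score > threshold]
    return pruned


# Toy data, invented for this example.
index: Index = {
    "rare": [("d1", 0.9), ("d2", 0.8)],
    "common": [("d1", 0.30), ("d2", 0.20), ("d3", 0.10), ("d4", 0.05)],
}

uniform = uniform_prune(index, threshold=0.25)
per_term = term_based_prune(index, k=2, epsilon=0.6)

print(uniform["common"])   # only the single highest-scoring posting survives
print(per_term["common"])  # the top two postings survive
```

On this toy data both schemes leave the high-scoring "rare" list intact, but the global cutoff of 0.25 strips the low-scoring "common" list down to one posting, while the per-term cutoff adapts to that list's own score range and keeps its top two. This is the intuition behind letting the threshold vary from term to term.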