Many information systems keep multiple versions of documents. Examples include content management systems, version control systems (e.g., ClearCase, CVS), wikis, and backup and archiving solutions. It is often desirable to support free-text search over such repositories, that is, to accept queries that may match any version of any document. We propose an indexing method that exploits the redundancy inherent in versioned documents by solving a variant of the multiple sequence alignment problem. The scheme produces an index that is much more compact than a standard index that treats each version independently. In experiments over publicly available versioned data, our method achieved compaction ratios of 81% as compared with standard indexing, while supporting the same retrieval capabilities.
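To make the source of the savings concrete, the following is a minimal sketch and not the alignment-based scheme proposed here. It assumes a much simpler bag-of-words simplification: a term that occurs in a run of consecutive versions is stored once with a version range, instead of once per version. The function names (`standard_index`, `range_index`, `lookup`, `posting_count`) and the toy versions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def standard_index(versions):
    """Baseline: one posting per (term, version), each version indexed independently."""
    index = defaultdict(list)
    for v, text in enumerate(versions):
        for term in set(text.split()):
            index[term].append(v)  # versions are appended in ascending order
    return dict(index)


def range_index(versions):
    """Illustrative compaction: one posting per run of consecutive versions containing a term."""
    index = {}
    for term, version_ids in standard_index(versions).items():
        runs = []
        for v in version_ids:
            if runs and runs[-1][1] == v - 1:
                runs[-1][1] = v      # term persists: extend the current run
            else:
                runs.append([v, v])  # term (re)appears: start a new run
        index[term] = [tuple(run) for run in runs]
    return index


def lookup(index, term):
    """Expand range postings back into the set of matching version numbers."""
    return {v for first, last in index.get(term, []) for v in range(first, last + 1)}


def posting_count(index):
    """Total number of postings stored across all terms."""
    return sum(len(postings) for postings in index.values())


# Hypothetical toy data: three versions of one document.
versions = [
    "the quick brown fox",
    "the quick brown fox jumps",
    "the quick red fox jumps",
]
std = standard_index(versions)
rng = range_index(versions)
```

On this toy input the baseline stores 14 postings and the range-based index stores 6, while `lookup(rng, term)` returns the same versions as the baseline for every term. Unlike this sketch, which ignores token positions, an alignment-based approach can also share positional information across versions.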