We propose succinct data structures for text retrieval systems that support document listing queries and ranking queries based on the tf*idf (term frequency times inverse document frequency) scores of documents. Traditional data structures for these problems support queries only for predetermined keywords. Recently, Muthukrishnan proposed a data structure that answers document listing queries for arbitrary patterns, at the cost of increased data structure size. For computing tf*idf scores, there have been no efficient data structures for arbitrary patterns. Our new data structures support these queries using small space: only 2/ε times the size of the compressed documents plus 10n bits for a document collection of length n, for any 0
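
To make the two query types concrete, the following is a minimal brute-force sketch in Python. It is not the succinct data structure proposed here: it scans every document for each query, and it only illustrates what a document listing query and a tf*idf ranking query return for an arbitrary pattern. The function names and the idf variant log(N/df) are assumptions chosen for illustration; other idf definitions are in common use.

```python
import math


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, overlapping ones included (the tf value)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are counted.
        start = text.find(pattern, start + 1)
    return count


def list_documents(docs, pattern):
    """Document listing: indices of the documents that contain pattern."""
    return [i for i, doc in enumerate(docs) if pattern in doc]


def tf_idf_scores(docs, pattern):
    """Map each matching document index to tf * idf for the pattern.

    Assumes idf = log(N / df), where N is the number of documents and
    df is the number of documents containing the pattern.
    """
    tf = [count_occurrences(doc, pattern) for doc in docs]
    df = sum(1 for c in tf if c > 0)
    if df == 0:
        return {}
    idf = math.log(len(docs) / df)
    return {i: c * idf for i, c in enumerate(tf) if c > 0}


if __name__ == "__main__":
    collection = ["abracadabra", "banana", "cadabra"]
    print(list_documents(collection, "abra"))
    print(tf_idf_scores(collection, "abra"))
```

Each query in this sketch costs time proportional to the total collection length, which is the cost an index for arbitrary patterns is meant to avoid.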