We introduce a benchmark called TEXTURE (TEXT Under RElations) to measure the relative strengths and weaknesses of combining text processing with a relational workload in an RDBMS. While the well-known TREC benchmarks focus on quality, we focus on efficiency. TEXTURE is a micro-benchmark for query workloads, and it considers two central text support issues that previous benchmarks did not: (1) queries with relevance ranking, rather than those that just compute all answers, and (2) a richer mix of text and relational processing, reflecting the trend toward seamless integration. In developing this benchmark, we had to address the problem of generating large text collections that reflect the performance characteristics of a given "seed" collection; this is essential for a controlled study of specific data characteristics and their effects on performance. In addition to presenting the benchmark, with performance numbers for three commercial DBMSs, we present and validate a synthetic generator for populating text fields.