The TEXTURE benchmark: measuring performance of text queries on a relational DBMS

Authors:
Vuk Ercegovac; David J. DeWitt; Raghu Ramakrishnan
Affiliations:
University of Wisconsin, Madison, WI; University of Wisconsin, Madison, WI; University of Wisconsin, Madison, WI
Venue:
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Year:
2005

Abstract

We introduce a benchmark called TEXTURE (TEXT Under RElations) to measure the relative strengths and weaknesses of combining text processing with a relational workload in an RDBMS. While the well-known TREC benchmarks focus on quality, we focus on efficiency. TEXTURE is a micro-benchmark for query workloads, and considers two central text support issues that previous benchmarks did not: (1) queries with relevance ranking, rather than those that just compute all answers, and (2) a richer mix of text and relational processing, reflecting the trend toward seamless integration. In developing this benchmark, we had to address the problem of generating large text collections that reflected the (performance) characteristics of a given "seed" collection; this is essential for a controlled study of specific data characteristics and their effects on performance. In addition to presenting the benchmark, with performance numbers for three commercial DBMSs, we present and validate a synthetic generator for populating text fields.
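The abstract does not say how the synthetic generator works, so the following is only an illustrative sketch of the general idea of growing a text collection from a "seed" collection. It assumes a simple unigram model: words are drawn in proportion to their seed frequencies, and document lengths are resampled from the seed. The function names (`build_unigram_model`, `generate_documents`) and the whitespace tokenizer are hypothetical and are not taken from the paper; the actual TEXTURE generator may preserve different or additional characteristics.

```python
import random
from collections import Counter


def build_unigram_model(seed_docs):
    """Collect word frequencies and document lengths from the seed collection.

    Assumption: whitespace tokenization and lowercasing are good enough
    for illustration; a real generator would likely tokenize more carefully.
    """
    word_counts = Counter()
    doc_lengths = []
    for doc in seed_docs:
        tokens = doc.lower().split()
        word_counts.update(tokens)
        doc_lengths.append(len(tokens))
    return word_counts, doc_lengths


def generate_documents(seed_docs, n_docs, rng_seed=0):
    """Generate n_docs synthetic documents that mimic the seed's
    word-frequency and document-length distributions (unigram assumption)."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs repeatable
    word_counts, doc_lengths = build_unigram_model(seed_docs)
    vocab = list(word_counts)
    weights = [word_counts[w] for w in vocab]

    docs = []
    for _ in range(n_docs):
        # Resample a document length observed in the seed collection.
        length = rng.choice(doc_lengths)
        # Draw words with probability proportional to seed frequency.
        words = rng.choices(vocab, weights=weights, k=length)
        docs.append(" ".join(words))
    return docs


if __name__ == "__main__":
    seed = ["the quick brown fox", "the lazy dog sleeps"]
    for text in generate_documents(seed, n_docs=3):
        print(text)
```

Because the random seed is fixed, repeated runs produce the same collection, which is the kind of repeatability a controlled study of data characteristics would need. A unigram model ignores word order and co-occurrence, so it would at best approximate term-frequency effects on index and query performance.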