SimDB: a similarity-aware database system

Authors:
Yasin N. Silva;Ahmed M. Aly;Walid G. Aref;Per-Ake Larson
Affiliations:
Purdue University, West Lafayette, IN, USA;Purdue University, West Lafayette, IN, USA;Purdue University, West Lafayette, IN, USA;Microsoft Research, Redmond, WA, USA
Venue:
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Year:
2010

Abstract

The identification and processing of similarities in data play a key role in many application scenarios. Several types of similarity-aware operations have been studied in the literature. However, most previous work studies these operations in isolation from other regular or similarity-aware operations. Furthermore, most previous research in the area considers standalone implementations, i.e., without any integration with a database system. In this demonstration we present SimDB, a similarity-aware database management system. SimDB supports multiple similarity-aware operations as first-class database operators. We describe the architectural changes needed to implement the similarity-aware operators. In particular, we present how the implementation machinery of conventional operators is extended to support similarity-aware operators. We also show how these operators interact with other similarity-aware and regular operators, and we show the effectiveness of multiple equivalence rules that can be used to extend cost-based query optimization to similarity-aware operations.