Multiattribute hashing using Gray codes

Authors:
Christos Faloutsos
Affiliations:
Dept of Computer Science, University of Maryland, College Park, MD
Venue:
SIGMOD '86 Proceedings of the 1986 ACM SIGMOD international conference on Management of data
Year:
1986

References (13)

Gray Codes for Partial Match and Range Queries. IEEE Transactions on Software Engineering
The Grid File: An Adaptable, Symmetric Multikey File Structure. ACM Transactions on Database Systems (TODS)
Performance analysis of linear hashing with partial expansions. ACM Transactions on Database Systems (TODS)
Partial-match retrieval using hashing and descriptors. ACM Transactions on Database Systems (TODS)
Optimal partial-match retrieval when fields are independently specified. ACM Transactions on Database Systems (TODS)
Extendible hashing—a fast access method for dynamic files. ACM Transactions on Database Systems (TODS)
Analysis and performance of inverted data base structures. Communications of the ACM
Attribute based file organization in a paged memory environment. Communications of the ACM
Multidimensional binary search trees used for associative searching. Communications of the ACM
The K-D-B-tree: a search structure for large multidimensional dynamic indexes. SIGMOD '81 Proceedings of the 1981 ACM SIGMOD international conference on Management of data
A class of data structures for associative searching. PODS '84 Proceedings of the 3rd ACM SIGACT-SIGMOD symposium on Principles of database systems
Spiral Storage: Incrementally Augmentable Hash Addressed Storage
Combinatorial Algorithms: Theory and Practice

Cited by (33)

Clustered multiattribute hash files. PODS '89 Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Linear clustering of objects with multiple attributes. SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Dynamic partitioning of signature files. ACM Transactions on Information Systems (TOIS)
Data structures for efficient broker implementation. ACM Transactions on Information Systems (TOIS)
Multidimensional access methods. ACM Computing Surveys (CSUR)
Snakes and sandwiches: optimal clustering strategies for a data warehouse. SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Multidimensional Index Structures in Relational Databases. Journal of Intelligent Information Systems - Data warehousing and knowledge discovery
Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases. ACM Computing Surveys (CSUR)
Scalability Analysis of Declustering Methods for Multidimensional Range Queries. IEEE Transactions on Knowledge and Data Engineering
Analysis of the Clustering Properties of the Hilbert Space-Filling Curve. IEEE Transactions on Knowledge and Data Engineering
PROBE Spatial Data Modeling and Query Processing in an Image Database Application. IEEE Transactions on Software Engineering
Parallel Algorithms for High-dimensional Similarity Joins for Data Mining Applications. VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Concurrent access to point data. COMPSAC '97 Proceedings of the 21st International Computer Software and Applications Conference
Implementation of Multidimensional Index Structures for Knowledge Discovery in Relational Databases. DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
Approximate k-Closest-Pairs with Space Filling Curves. DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Speeding up construction of PMR quadtree-based spatial indexes. The VLDB Journal — The International Journal on Very Large Data Bases
Outlier Mining in Large High-Dimensional Data Sets. IEEE Transactions on Knowledge and Data Engineering
Neighbor-finding based on space-filling curves. Information Systems
An approximate algorithm for top-k closest pairs join query in large high dimensional data. Data & Knowledge Engineering
Transform-Space View: Performing Spatial Join in the Transform Space Using Original-Space Indexes. IEEE Transactions on Knowledge and Data Engineering
A scientific database system for polymers and materials engineering needs. SSDBM'1994 Proceedings of the 7th international conference on Scientific and Statistical Database Management
CLAM: concurrent location management for moving objects. Proceedings of the 15th annual ACM international symposium on Advances in geographic information systems
A cache invalidation scheme for continuous partial match queries in mobile computing environments. Distributed and Parallel Databases
High-dimensional descriptor indexing for large multimedia databases. Proceedings of the 17th ACM conference on Information and knowledge management
Gray code chaining: a high performance hashing algorithm for limited storage applications. International Journal of High Performance Systems Architecture
Reordering columns for smaller indexes. Information Sciences: an International Journal
P2P-based multidimensional indexing methods: A survey. Journal of Systems and Software
Variable granularity space filling curve for indexing multidimensional data. ADBIS'11 Proceedings of the 15th international conference on Advances in databases and information systems
Multi-attribute hashing of wireless data for content-based queries. ICDCIT'05 Proceedings of the Second international conference on Distributed Computing and Internet Technology
On the optimality of clustering properties of space filling curves. PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
Group-Scope query and its access method. APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
Reordering rows for better compression: Beyond the lexicographic order. ACM Transactions on Database Systems (TODS)
Approximate covering detection among content-based subscriptions using space filling curves. Journal of Parallel and Distributed Computing

Abstract

Multiattribute hashing and its variations have been proposed in the past for partial match and range queries. The main idea is that each record yields a bitstring (the “record signature”) according to the values of its attributes. The binary value of this string determines the bucket in which the record is stored. In this paper we propose to use Gray codes instead of binary codes to map record signatures to buckets. In Gray codes, successive codewords differ in exactly one bit position; thus, successive buckets hold records with similar record signatures. The proposed method achieves better clustering of similar records and avoids some of the (expensive) random disk accesses, replacing them with sequential ones. We develop a mathematical model, derive formulas giving the average performance of both methods, and show that the proposed method achieves 0% to 50% relative savings over the binary codes. We also discuss how Gray codes could be applied to some retrieval methods designed for range queries, such as the grid file [Nievergelt84a] and the approach based on the so-called z-ordering [Orenstein84a].
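The clustering effect described in the abstract can be illustrated with a small sketch. It assumes the standard binary-reflected Gray code and 3-bit record signatures, and it reads "mapping signatures to buckets with Gray codes" as storing each record in the bucket whose number is the rank of its signature in the Gray sequence. The function names (`to_gray`, `gray_rank`, `matching_signatures`, `count_clusters`) and the `*` wildcard notation for unspecified bits are illustrative choices, not taken from the paper.

```python
from itertools import product


def to_gray(rank):
    """Return the binary-reflected Gray codeword at position `rank`."""
    return rank ^ (rank >> 1)


def gray_rank(codeword):
    """Inverse of to_gray: position of `codeword` in the Gray sequence."""
    rank = codeword
    shift = codeword >> 1
    while shift:
        rank ^= shift
        shift >>= 1
    return rank


def matching_signatures(pattern):
    """Expand a partial-match pattern such as '*1*' into all signatures.

    '*' marks an unspecified bit (hypothetical notation for this sketch).
    """
    free = [i for i, c in enumerate(pattern) if c == "*"]
    signatures = []
    for bits in product("01", repeat=len(free)):
        chars = list(pattern)
        for pos, bit in zip(free, bits):
            chars[pos] = bit
        signatures.append(int("".join(chars), 2))
    return signatures


def count_clusters(buckets):
    """Count maximal runs of consecutive bucket numbers."""
    ordered = sorted(set(buckets))
    return sum(
        1 for i, b in enumerate(ordered) if i == 0 or b != ordered[i - 1] + 1
    )


sigs = matching_signatures("*1*")           # middle bit specified as 1
binary_buckets = sorted(sigs)               # bucket = binary value
gray_buckets = sorted(gray_rank(s) for s in sigs)  # bucket = Gray rank

print(binary_buckets, count_clusters(binary_buckets))  # [2, 3, 6, 7] 2
print(gray_buckets, count_clusters(gray_buckets))      # [2, 3, 4, 5] 1
```

For this one query the qualifying buckets form two separate runs under the binary mapping and a single run under the Gray mapping, so one random access is replaced by sequential ones. This is a single hand-picked case; the average savings over all queries are what the paper's model quantifies.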