Access methods based on signature files can benefit greatly from parallel environments. To this end, an effective declustering strategy, which distributes signatures over a set of independent parallel disks, has to be combined with a complementary clustering scheme, which avoids searching the whole signature file when executing a query. This article proposes two parallel signature file organizations, Hamming Filter (HF) and Hamming + Filter (H+F). Both share a declustering strategy based on error correcting codes, and both achieve clustering by organizing signatures into fixed-size buckets, each containing signatures that share the same key value. HF allocates signatures to disks statically and works well when the parameters of the code are correctly matched to the size of the file. H+F is a generalization of HF suited to managing highly dynamic files. It uses dynamic declustering, obtained through a sequence of codes, and organizes a smooth migration of signatures between disks so that high performance is retained regardless of the current file size. Theoretical analysis characterizes the best-case, expected, and worst-case behavior of these organizations. The analytical results are verified by experiments on prototype systems.
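The general idea of declustering with an error correcting code can be illustrated with a minimal sketch. It is not the HF or H+F algorithm itself, whose details the abstract does not give. The sketch assumes a 7-bit bucket key taken from the start of each signature and uses the syndrome of that key under the parity-check matrix of the Hamming (7,4) code as the disk number, giving 8 disks. The names `KEY_BITS`, `key_of`, `disk_for_key`, and `min_same_disk_distance` are hypothetical and chosen only for illustration.

```python
from itertools import product

KEY_BITS = 7  # assumed key length; Hamming (7,4) has 3 parity bits, so 2**3 = 8 disks


def key_of(signature):
    """Assumption: the bucket key is the first KEY_BITS bits of the signature."""
    return tuple(signature[:KEY_BITS])


def disk_for_key(key_bits):
    """Return the syndrome of the key, used here as the disk number.

    Column i (1-based) of the parity-check matrix is the binary
    representation of i, so the syndrome is the XOR of the positions
    of all set bits.
    """
    syndrome = 0
    for position, bit in enumerate(key_bits, start=1):
        if bit:
            syndrome ^= position
    return syndrome


def hamming_distance(a, b):
    """Number of bit positions in which two equal-length keys differ."""
    return sum(x != y for x, y in zip(a, b))


def min_same_disk_distance():
    """Smallest Hamming distance between two distinct keys on the same disk."""
    by_disk = {}
    for key in product((0, 1), repeat=KEY_BITS):
        by_disk.setdefault(disk_for_key(key), []).append(key)
    return min(
        hamming_distance(a, b)
        for keys in by_disk.values()
        for i, a in enumerate(keys)
        for b in keys[i + 1:]
    )


if __name__ == "__main__":
    signature = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0)  # made-up example signature
    print("bucket key:", key_of(signature))
    print("disk:", disk_for_key(key_of(signature)))
    print("min distance on one disk:", min_same_disk_distance())
```

In this toy setting, two distinct keys that land on the same disk differ by a nonzero codeword, so they differ in at least three bit positions. Buckets with similar keys, which a single query is likely to touch together, are therefore spread across different disks and can be read in parallel.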