ACM Transactions on Information Systems (TOIS)
With the profusion of text databases on the Internet, it is becoming increasingly hard to find the most useful databases for a given query. To attack this problem, several existing and proposed systems employ brokers to direct user queries, using a local database of summary information about the available databases. This summary information must effectively distinguish relevant databases and must be compact while allowing efficient access. We offer evidence that one broker, GlOSS, can be effective at locating databases of interest even in a system of hundreds of databases, and we examine the performance of accessing the GlOSS summaries for two promising storage methods: the grid file and partitioned hashing. We show that both methods can be tuned to provide good performance for a particular workload (within a broad range of workloads), and we discuss the tradeoffs between the two data structures. As a side effect of our work, we show that grid files are more broadly applicable than previously thought; in particular, we show that by varying the policies used to construct the grid file we can provide good performance for a wide range of workloads even when storing highly skewed data.
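To make the tuning idea concrete, here is a minimal sketch of textbook partitioned hashing, not the implementation evaluated in the paper. It assumes each summary record is keyed by a (word, database) pair, which the abstract does not state explicitly. The function names, the CRC32 hash, and the bit widths are all hypothetical choices made for illustration.

```python
import zlib


def attr_bits(value: str, nbits: int) -> int:
    """Hash one attribute value down to nbits bits (always 0 when nbits == 0)."""
    return zlib.crc32(value.encode("utf-8")) & ((1 << nbits) - 1)


def bucket(word: str, db: str, word_bits: int, db_bits: int) -> int:
    """Bucket address: the word's hash bits followed by the database's hash bits."""
    return (attr_bits(word, word_bits) << db_bits) | attr_bits(db, db_bits)


def buckets_for_word(word: str, word_bits: int, db_bits: int) -> list[int]:
    """Partial match on the word only: the database bits are unknown,
    so every combination of them must be visited (2**db_bits buckets)."""
    prefix = attr_bits(word, word_bits) << db_bits
    return [prefix | d for d in range(1 << db_bits)]


def buckets_for_db(db: str, word_bits: int, db_bits: int) -> list[int]:
    """Partial match on the database only: the word bits are unknown,
    so every combination of them must be visited (2**word_bits buckets)."""
    suffix = attr_bits(db, db_bits)
    return [(w << db_bits) | suffix for w in range(1 << word_bits)]


if __name__ == "__main__":
    # Hypothetical split: 3 bits for the word, 2 for the database (32 buckets).
    print(len(buckets_for_word("retrieval", 3, 2)))  # buckets read for a word lookup
    print(len(buckets_for_db("db17", 3, 2)))         # buckets read for a database scan
```

In this sketch, shifting bits from the database attribute to the word attribute makes lookups by word touch fewer buckets and per-database scans touch more. That is one plausible reading of how a structure like this can be tuned to a workload.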