When a multidatabase system contains textual database systems (i.e., information retrieval systems), queries against its global schema may contain a new type of join: a join between attributes of textual type. Because such joins often involve very large document collections, efficient algorithms for processing them are essential. This paper presents three algorithms for processing such joins and analyzes their I/O costs. The algorithms differ in whether the join is processed using the documents themselves or the inverted files built on them. Our analysis and simulation results indicate that the relative performance of the algorithms depends on the input document collections, the system characteristics, and the input query. For each algorithm, we identify the type of input document collections on which it is likely to perform well. We also propose an integrated algorithm that automatically selects the best algorithm to use.
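To make the distinction between document-based and inverted-file-based processing concrete, the following is a minimal sketch and not the paper's algorithms, which the abstract does not specify. It assumes that a textual join pairs each document in one collection with its top-k most similar documents in the other, and that similarity is the dot product of term-frequency vectors. The function names (`join_by_documents`, `join_by_inverted_file`, `build_inverted_file`) and the whitespace tokenizer are hypothetical, and everything is held in memory, so the sketch says nothing about I/O cost.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def similarity(a, b):
    # Assumed similarity: dot product of term-frequency vectors.
    ta, tb = Counter(tokenize(a)), Counter(tokenize(b))
    return sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())


def join_by_documents(c1, c2, k):
    # Strategy 1: compare the documents directly (nested loop over both collections).
    result = {}
    for i, d1 in enumerate(c1):
        scored = [(similarity(d1, d2), j) for j, d2 in enumerate(c2)]
        scored = [(s, j) for s, j in scored if s > 0]
        scored.sort(key=lambda p: (-p[0], p[1]))  # best score first, ties by id
        result[i] = [j for _, j in scored[:k]]
    return result


def build_inverted_file(collection):
    # Map each term to a list of (document id, term frequency) postings.
    index = defaultdict(list)
    for j, doc in enumerate(collection):
        for term, tf in Counter(tokenize(doc)).items():
            index[term].append((j, tf))
    return index


def join_by_inverted_file(c1, c2, k):
    # Strategy 2: probe an inverted file on c2 with the terms of each c1 document,
    # so only documents sharing at least one term are ever touched.
    index = build_inverted_file(c2)
    result = {}
    for i, d1 in enumerate(c1):
        scores = defaultdict(int)
        for term, tf1 in Counter(tokenize(d1)).items():
            for j, tf2 in index.get(term, ()):
                scores[j] += tf1 * tf2
        ranked = sorted(scores.items(), key=lambda p: (-p[1], p[0]))
        result[i] = [j for j, _ in ranked[:k]]
    return result
```

Both functions return the same matches on the same input. They differ only in how much of the second collection each one has to read, which is the kind of trade-off the abstract says depends on the collections, the system, and the query.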