In this paper we describe two optimization techniques that are specifically tailored for information gathering. The first is a greedy minimization algorithm that minimizes an information gathering plan by removing redundant and overlapping information sources without loss of completeness. We then discuss a set of heuristics that guide the greedy minimization algorithm so that costlier information sources are removed first. In contrast to previous work, our approach can handle the recursive query plans that commonly arise in practice. Second, we present a method for ordering the access to sources so as to reduce the execution cost. Sources on the Internet have a variety of access limitations, and the execution cost of information gathering is affected by both network traffic and connection setup costs. We describe a way of representing the access capabilities of sources, and provide a greedy algorithm for ordering source calls that respects source limitations and takes both access costs and traffic costs into account, without requiring full source statistics. Finally, we discuss the implementation and empirical evaluation of these methods in Emerac, our prototype information gathering system.
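The two greedy ideas summarized above can be illustrated with a minimal sketch. It is not the algorithm implemented in Emerac: the `Source` record, the explicit `covers` sets standing in for completeness reasoning, the single scalar `cost`, and the `needs`/`binds` attribute sets standing in for access capabilities are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical source description (all fields are illustrative assumptions)."""
    name: str
    covers: frozenset                # portion of the answer this source can supply
    cost: float                      # single scalar standing in for access + traffic cost
    needs: frozenset = frozenset()   # attributes that must be bound before the call
    binds: frozenset = frozenset()   # attributes that become bound after the call


def minimize_plan(sources):
    """Greedily drop sources whose coverage is supplied by the remaining ones.

    Candidates are tried costliest first, so expensive redundant sources
    are removed before cheap ones. Total coverage is never reduced.
    """
    kept = list(sources)
    for s in sorted(sources, key=lambda src: src.cost, reverse=True):
        others = [t for t in kept if t is not s]
        rest = frozenset().union(*(t.covers for t in others))
        if s.covers <= rest:         # s adds nothing the others lack
            kept = others
    return kept


def order_calls(sources, bound=frozenset()):
    """Greedily order source calls while respecting binding restrictions.

    At each step, pick the cheapest source whose required attributes
    are already bound, then record the attributes it binds.
    """
    remaining = list(sources)
    known = set(bound)
    order = []
    while remaining:
        feasible = [s for s in remaining if s.needs <= known]
        if not feasible:
            raise ValueError("binding restrictions cannot be satisfied")
        nxt = min(feasible, key=lambda src: src.cost)
        order.append(nxt)
        known |= nxt.binds
        remaining.remove(nxt)
    return order
```

For example, given sources A (covers {1, 2}, cost 5), B (covers {2, 3}, cost 1), and C (covers {1, 2, 3}, cost 2), `minimize_plan` keeps only C. In the ordering step, a source that requires a bound attribute is deferred until some cheaper or unrestricted source has supplied that binding.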