Life science data sources form a complex graph of overlapping sources, with multiple alternate links between sources. A (navigational) query may be answered by traversing multiple alternate paths between a start source and a target source. These paths may differ in benefit, e.g., the cardinality of result objects reached in the target source, and in cost of evaluation, i.e., the execution cost of a query evaluation plan for a path. In prior research, we developed ESearch, an algorithm based on a Deterministic Finite Automaton (DFA) that exhaustively enumerates all paths to answer a navigational query. The challenge is to develop heuristics that improve on the exhaustive ESearch solution, and to identify good utility functions that rank the sources, the links between sources, and the sub-paths already visited, so as to quickly produce the paths with the highest benefit and the least cost. In this paper, we present a heuristic that uses local utility functions to rank sources by the benefit attributed to the source, the cost of a plan using the source, or both. The heuristic limits its search to some Top XX% of the ranked sources. To compare ESearch and the heuristic, we construct a Pareto surface of all dominant solutions produced by ESearch with respect to benefit and cost, and choose the Top 25% of the ESearch solutions on that surface. We then compare the paths produced by the heuristic to this Top 25% of ESearch solutions with respect to precision and recall. The comparison motivates further research on a more efficient algorithm and better utility functions.
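The evaluation idea in the abstract (keep only the paths that are not dominated in benefit and cost, then score a heuristic's output against that reference set by precision and recall) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `(path_id, benefit, cost)` tuple representation, the function names `pareto_front` and `precision_recall`, and the sample numbers are hypothetical and do not come from the paper. The abstract does not say how the Top 25% of the Pareto surface is selected, so that step is left out.

```python
from typing import List, Set, Tuple

# Hypothetical representation: (path id, benefit, cost).
# Higher benefit is better; lower cost is better.
Path = Tuple[str, float, float]


def pareto_front(paths: List[Path]) -> List[Path]:
    """Return the paths that no other path dominates.

    A path q dominates p if q is at least as good on both criteria
    (benefit >= and cost <=) and strictly better on at least one.
    """
    front = []
    for name, benefit, cost in paths:
        dominated = any(
            (b2 >= benefit and c2 <= cost) and (b2 > benefit or c2 < cost)
            for _, b2, c2 in paths
        )
        if not dominated:
            front.append((name, benefit, cost))
    return front


def precision_recall(found: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Precision and recall of a set of found path ids against a reference set."""
    hits = len(found & reference)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Made-up sample data, standing in for an exhaustive enumeration of paths.
paths = [
    ("p1", 100, 10),
    ("p2", 80, 5),
    ("p3", 60, 8),    # dominated by p2 (more benefit, less cost)
    ("p4", 100, 12),  # dominated by p1 (same benefit, less cost)
    ("p5", 20, 1),
]

reference_ids = {name for name, _, _ in pareto_front(paths)}  # {"p1", "p2", "p5"}

# Made-up heuristic output: it finds two of the three reference paths.
heuristic_ids = {"p1", "p3", "p5"}
precision, recall = precision_recall(heuristic_ids, reference_ids)
```

The pairwise dominance check is quadratic in the number of paths, which is fine for a small illustration; for a large exhaustive enumeration, a sort-based sweep over cost would be the more usual choice.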