K-relevance: a spectrum of relevance for data sources impacting a query

Authors:
Jiansheng Huang; Jeffrey F. Naughton
Affiliations:
University of Wisconsin at Madison, Madison, WI; University of Wisconsin at Madison, Madison, WI
Venue:
Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data
Year:
2007

Abstract

Applications ranging from grid management to sensor nets to web-based information integration and extraction can be viewed as receiving data from some number of autonomous remote data sources and then answering queries over this collected data. In such environments it is helpful to inform users which data sources are "relevant" to their query results. It is not immediately obvious what "relevant" should mean in this context, as different users will have different requirements. In this paper, rather than proposing a single definition of relevance, we propose a spectrum of definitions, which we term "k-relevance", for k ≥ 0. We give algorithms for identifying k-relevant data sources for relational queries and explore their efficiency both analytically and experimentally. Finally, we explore the impact of integrity constraints (including dependencies) and materialized views on the problem of computing and maintaining relevant data sources.