Rank join operators perform a relational join among two or more relations, assign numeric scores to the join results based on a given scoring function, and return the K join results with the highest scores while accessing only a subset of the data in the input relations. Most rank join operators compute a score upper bound on any join result that could still be obtained from the unseen data. A join result is kept in an output buffer and is deterministically reported to the user once its score is greater than or equal to this upper bound. The value of the score upper bound decreases as further data is extracted from the relations. When the data sources are Web services, which are characterized by non-negligible response time for every data fetch, the score upper bound may decrease slowly. Consequently, there is a long delay before a join result stored in the output buffer is reported. This paper addresses the problem of efficiently reporting a top join result obtained by joining the data of two such Web services. We present a probabilistic reporting method that computes the confidence with which a join result may appear among the final top-K join results, and that reports a join result as soon as this confidence exceeds a given threshold. This allows a join result to be reported soon after it is observed. An extensive experimental study with various settings of different operating parameters validates the importance of the proposed approach on both real and synthetic data sets. The results show that the proposed approach significantly reduces the average difference between the time when a join result is observed and the time when it is reported, while incurring negligible errors in the final results.
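The contrast between deterministic and confidence-based reporting can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the scoring function is a plain sum, both inputs are read in descending score order, the upper bound is the common "corner" bound used by many rank join operators, and `toy_confidence` is a purely hypothetical placeholder, not the confidence estimator the paper proposes. All function names are illustrative.

```python
def score_upper_bound(first_left, last_left, first_right, last_right):
    """Corner bound for sum scoring (assumed): the best score that a join
    result involving at least one unseen tuple could still reach, given the
    first and last scores seen so far on each input."""
    return max(first_left + last_right, last_left + first_right)


def reportable_deterministic(buffer_scores, bound, k):
    """Scores among the top-k buffered results that are guaranteed final,
    i.e. whose score is greater than or equal to the upper bound."""
    ranked = sorted(buffer_scores, reverse=True)[:k]
    return [s for s in ranked if s >= bound]


def toy_confidence(score, bound):
    """Hypothetical placeholder: ratio of the score to the bound, capped at 1.
    This is NOT the paper's estimator; it only makes the sketch runnable."""
    if bound <= 0:
        return 1.0
    return min(1.0, score / bound)


def reportable_probabilistic(buffer_scores, confidence_of, threshold, k):
    """Scores among the top-k buffered results whose confidence exceeds
    the given threshold."""
    ranked = sorted(buffer_scores, reverse=True)[:k]
    return [s for s in ranked if confidence_of(s) > threshold]


# Toy state: left input has returned scores 90 then 80, right input 100 then 70.
bound = score_upper_bound(90, 80, 100, 70)
buffered = [160, 190]  # scores of join results observed so far

early = reportable_probabilistic(
    buffered, lambda s: toy_confidence(s, bound), threshold=0.85, k=2
)
certain = reportable_deterministic(buffered, bound, k=2)
```

In this toy state the deterministic rule releases only the result whose score already meets the bound, while the thresholded rule also releases the second result without waiting for further fetches. That earlier release, at the risk of an occasional wrong report, is the trade-off the abstract describes.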