Building ranked mashups of unstructured sources with uncertain information

Authors:
Mohamed A. Soliman;Ihab F. Ilyas;Mina Saleeb
Affiliations:
University of Waterloo;University of Waterloo;University of Waterloo
Venue:
Proceedings of the VLDB Endowment
Year:
2010

Abstract

Mashups are situational applications that join multiple sources to better meet the information needs of Web users. Web sources can be huge databases behind query interfaces, which creates the need to rank mashup results according to user preferences. We present MashRank, a mashup authoring and processing system that builds on concepts from rank-aware processing, probabilistic databases, and information extraction to enable ranked mashups of (unstructured) sources with uncertain ranking attributes. MashRank is based on new semantics, formulations, and processing techniques for handling uncertain preference scores, represented as intervals enclosing possible score values. MashRank integrates information extraction with query processing by asynchronously pushing extracted data on the fly into pipelined rank-aware query plans and by using ranking early-out requirements to limit extraction cost. To the best of our knowledge, neither the technical problems nor the target applications of MashRank have been addressed before.
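To make the idea of interval-valued preference scores concrete, the following is a minimal Python sketch. It is not MashRank's algorithm, which the abstract does not describe. All names here (`Scored`, `dominates`, `possible_top_k`, the hotel data) are hypothetical. The sketch assumes each record's true score lies somewhere in its interval, independently of the other records, and that one record surely outranks another only when its lower bound exceeds the other's upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    """A record whose preference score is only known to lie in [lo, hi]."""
    name: str
    lo: float
    hi: float


def dominates(a: Scored, b: Scored) -> bool:
    # a surely ranks above b only if a's worst case beats b's best case.
    return a.lo > b.hi


def possible_top_k(records: list[Scored], k: int) -> list[str]:
    """Names of records that could appear in the top-k under some choice of
    scores within the intervals (assumed rule: fewer than k sure dominators)."""
    result = []
    for r in records:
        dominators = sum(
            1 for other in records if other is not r and dominates(other, r)
        )
        if dominators < k:
            result.append(r.name)
    return result


# Hypothetical data: hotel ratings extracted with uncertainty.
hotels = [
    Scored("A", 8.0, 9.0),
    Scored("B", 6.0, 8.5),
    Scored("C", 3.0, 5.0),
    Scored("D", 1.0, 2.0),
]

print(possible_top_k(hotels, 2))  # ['A', 'B']
```

Under these assumptions, records C and D can be pruned for k = 2 without resolving the overlap between A and B. This illustrates, in a simplified way, how score bounds can limit the amount of data that needs further processing.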