Set-valued data is a natural and concise representation for modeling complex objects. Set containment query processing over set-valued data is an important operation in object-oriented and object-relational databases and has been studied extensively in prior work. More recently, uncertain information has come to be recognized as a first-class citizen in modern database management, which creates a strong demand for set containment queries over uncertain set-valued data. This paper investigates how such queries can be processed efficiently. First, based on the popular possible world semantics, we present a practical model in which the uncertainty in set-valued data is represented by existential probabilities, and we propose the probabilistic set containment semantics and its generalization, the expected Jaccard containment. Second, to avoid the expensive computation of enumerating all possible worlds, we develop efficient schemes for computing these two probabilistic semantics. Third, we introduce two important queries, the probability threshold containment query (PTCQ) and the probability threshold containment join (PTCJ), and propose novel techniques to process them efficiently. Finally, extensive experiments indicate that the proposed methods process uncertain set containment queries efficiently.
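To make the possible world semantics concrete, the following is a minimal Python sketch. It is not the paper's algorithm and rests on several assumptions that the abstract does not state: the query set is certain, each element of an uncertain record exists independently with its existential probability, and Jaccard containment is taken to be |Q ∩ S| / |Q|. All function names are hypothetical. Under these assumptions the containment probability reduces to a product and the expected Jaccard containment to an average, and a brute-force enumeration of all possible worlds can be used to check both shortcuts on small inputs.

```python
from itertools import product
from math import prod

# Assumed representation: an uncertain set is a dict mapping each element
# to its existential probability. Elements are assumed to be independent.

def containment_probability(query, record):
    """P(query is a subset of record), for a certain query set."""
    return prod(record.get(item, 0.0) for item in query)

def expected_jaccard_containment(query, record):
    """E[|query & record| / |query|], using linearity of expectation."""
    if not query:
        return 1.0  # convention chosen for this sketch
    return sum(record.get(item, 0.0) for item in query) / len(query)

def possible_worlds(record):
    """Yield (world, probability) for each of the 2^n possible worlds."""
    items = list(record)
    for mask in product([False, True], repeat=len(items)):
        world = {x for x, keep in zip(items, mask) if keep}
        p = prod(record[x] if keep else 1.0 - record[x]
                 for x, keep in zip(items, mask))
        yield world, p

def brute_force(query, record):
    """Compute both measures by enumeration (query must be non-empty)."""
    prob, ejc = 0.0, 0.0
    for world, p in possible_worlds(record):
        if query <= world:
            prob += p
        ejc += p * len(query & world) / len(query)
    return prob, ejc

def threshold_containment_query(query, records, tau):
    """Hypothetical PTCQ-style filter: ids of records whose containment
    probability is at least tau (a linear scan, with no index)."""
    return [rid for rid, rec in records.items()
            if containment_probability(query, rec) >= tau]
```

The enumeration visits 2^n worlds for a record with n uncertain elements, while the closed forms touch each query element once. That gap is the reason enumeration is avoided in practice.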