Indexing correlated probabilistic databases
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
PrDB: managing and exploiting rich correlations in probabilistic databases
The VLDB Journal — The International Journal on Very Large Data Bases
Journal of Computer and System Sciences
A growing number of real-world application domains generate data that is inherently noisy, incomplete, and probabilistic in nature. Statistical inference and probabilistic modeling often introduce another layer of uncertainty on top of that. Examples of such data include measurement data collected by sensor networks, observation data in the context of social networks, scientific and biomedical data, and data collected by various online cyber-sources. Over the last few years, numerous approaches have been proposed, and several systems built, to integrate uncertainty into databases. However, these approaches typically make simplistic and restrictive assumptions concerning the types of uncertainties that can be represented. Most importantly, they often make highly restrictive independence assumptions and cannot easily model rich correlations among tuples or attribute values. Furthermore, they typically lack support for specifying uncertainties at different levels of abstraction, which is needed to handle large-scale uncertain datasets.