Many real-life applications depend on databases that are automatically curated from unstructured sources by imperfect structure extraction tools. Such databases are best treated as imprecise representations of multiple extraction possibilities. State-of-the-art statistical models of extraction provide a sound probability distribution over extractions, but these distributions are not easy to represent and query in a relational framework. In this paper we address the challenge of approximating such distributions as imprecise data models. In particular, we investigate a model that captures both row-level and column-level uncertainty, and show that this representation provides a significantly better approximation than models that use only row-level or only column-level uncertainty. We present efficient algorithms for finding the best approximating parameters for such a model; the algorithms exploit the structure of the model to avoid enumerating the exponential number of extraction possibilities.
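To make the contrast between column-only and combined row-and-column uncertainty concrete, the following is a minimal sketch on a toy example, not the paper's algorithm. It assumes several things the abstract does not state: that approximation quality is measured by KL divergence, that a column-level model is a product of independent per-column distributions, and that a combined model is a weighted mixture of such products (one per row). The distribution `P`, the column names, and the hand-chosen partition by city are all hypothetical. The sketch also enumerates every extraction explicitly, which is exactly what the paper's algorithms are designed to avoid.

```python
import math
from collections import defaultdict

# Hypothetical distribution over extractions of one address string.
# Each extraction assigns a value to the columns (city, zip).
P = {
    ("Springfield", "62701"): 0.45,
    ("Springfield", "97477"): 0.05,
    ("Eugene", "97477"): 0.40,
    ("Eugene", "62701"): 0.10,
}


def column_marginals(dist):
    """Normalized per-column marginals of a (possibly unnormalized) distribution."""
    total = sum(dist.values())
    n_cols = len(next(iter(dist)))
    marginals = [defaultdict(float) for _ in range(n_cols)]
    for row, p in dist.items():
        for c, value in enumerate(row):
            marginals[c][value] += p / total
    return marginals


def product_prob(marginals, row):
    """Probability of a row under independent per-column distributions."""
    return math.prod(m.get(v, 0.0) for m, v in zip(marginals, row))


def kl_divergence(p, q_fn):
    """KL(p || q), where q_fn maps a row to its approximate probability."""
    return sum(pr * math.log(pr / q_fn(row)) for row, pr in p.items() if pr > 0)


# Column-only model: a single product of column marginals.
col_model = column_marginals(P)
kl_col = kl_divergence(P, lambda row: product_prob(col_model, row))

# Combined model (assumed form): two weighted rows, each with its own
# per-column distributions. The split by city is chosen by hand here.
groups = [
    {row: p for row, p in P.items() if row[0] == "Springfield"},
    {row: p for row, p in P.items() if row[0] == "Eugene"},
]
components = [(sum(g.values()), column_marginals(g)) for g in groups]


def mixed_prob(row):
    """Probability of a row under the weighted mixture of product models."""
    return sum(weight * product_prob(m, row) for weight, m in components)


kl_mixed = kl_divergence(P, mixed_prob)

print(f"KL, column-only model: {kl_col:.4f}")
print(f"KL, two-row mixture:   {kl_mixed:.4f}")
```

In this toy case the column-only model loses the correlation between city and zip, so its divergence from `P` is clearly positive, while the two-row mixture recovers `P` essentially exactly. Real extraction distributions are far larger, and choosing the rows and their parameters well is the hard part that the paper addresses.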