Unsupervised Information Extraction (UIE) is the task of extracting knowledge from text without using hand-tagged training examples. A fundamental problem for both UIE and supervised IE is assessing the probability that extracted information is correct. In massive corpora such as the Web, the same extraction is found repeatedly in different documents. How does this redundancy impact the probability of correctness? This paper introduces a combinatorial "balls-and-urns" model that computes the impact of sample size, redundancy, and corroboration from multiple distinct extraction rules on the probability that an extraction is correct. We describe methods for estimating the model's parameters in practice and demonstrate experimentally that for UIE the model's log likelihoods are 15 times better, on average, than those obtained by Pointwise Mutual Information (PMI) and the noisy-or model used in previous work. For supervised IE, the model's performance is comparable to that of Support Vector Machines and Logistic Regression.
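To make the roles of redundancy and sample size concrete, the following is a minimal sketch rather than the paper's actual model. It assumes a toy single-urn setting in which every correct extraction is drawn with the same probability `p_correct` per draw and every erroneous one with `p_error`, and it applies Bayes' rule to an extraction seen `k` times in `n` draws. The function names, parameters, and numbers are all hypothetical. A standard noisy-or combination is included for contrast; note that it ignores `n` entirely.

```python
from math import exp, log, log1p, prod


def noisy_or(rule_precisions):
    """Standard noisy-or: the extraction is wrong only if every
    supporting observation is independently wrong."""
    return 1.0 - prod(1.0 - p for p in rule_precisions)


def urn_posterior(k, n, num_correct, num_error, p_correct, p_error):
    """Toy two-rate urn (an assumption, not the paper's full model).

    Returns P(extraction is correct | seen k times in n draws), where
    num_correct / num_error are the numbers of distinct correct and
    erroneous labels, and p_correct / p_error are their per-draw
    probabilities. The binomial coefficient cancels in the ratio.
    """
    def log_weight(count, p):
        # log of: count * p^k * (1 - p)^(n - k)
        return log(count) + k * log(p) + (n - k) * log1p(-p)

    diff = log_weight(num_error, p_error) - log_weight(num_correct, p_correct)
    if diff > 700:  # avoid overflow in exp(); posterior is effectively 0
        return 0.0
    return 1.0 / (1.0 + exp(diff))


if __name__ == "__main__":
    # Hypothetical urn: 100 correct labels, 1000 error labels,
    # correct labels repeat ten times as often as errors.
    params = dict(num_correct=100, num_error=1000,
                  p_correct=0.01, p_error=0.001)
    for k, n in [(1, 100), (3, 100), (3, 10000)]:
        print(f"k={k}, n={n}: urn={urn_posterior(k, n, **params):.3f}")
    # Noisy-or with an assumed per-observation precision of 0.5
    print("noisy-or, 3 observations:", noisy_or([0.5] * 3))
```

Under these assumed parameters, seeing an extraction three times in 100 draws makes it more likely to be correct than seeing it once, while the same three sightings in 10,000 draws count for much less. The noisy-or score for three observations is the same regardless of how many draws were made, which is the kind of sample-size effect the abstract says the urn model accounts for.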