We present a robust method for gathering relational facts from the Web, based on matching generalized patterns that are automatically learned from seed facts for the relations of interest. Our approach combines these generalized patterns, which give high-recall information extraction, with a rule-based, declarative reasoning approach that also ensures high precision. Newly extracted candidate facts are assigned statistical weights that reflect the strengths of the patterns used to extract them. To check the plausibility of candidate facts with respect to existing knowledge and competing hypotheses, we use an efficient algorithm for weighted Max-Sat over propositional-logic clauses. In contrast to prior work on reasoning-based information extraction, we employ richer statistics and smart pruning to bound the number of grounded rules passed on to the Max-Sat solver.
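To make the plausibility check more concrete, the following is a minimal sketch of weighted Max-Sat over propositional clauses, written in Python. It is not the efficient algorithm the abstract refers to: it searches all truth assignments exhaustively, which is only feasible for a handful of variables. The fact names, the weights, and the functional constraint ("a person is born in at most one place") are hypothetical and chosen only for illustration.

```python
from itertools import product

# A literal is (variable_name, polarity); a clause is (weight, [literals]).
def clause_satisfied(literals, assignment):
    """A clause holds if at least one of its literals matches the assignment."""
    return any(assignment[var] == polarity for var, polarity in literals)

def weighted_max_sat(variables, clauses):
    """Brute-force search: return (best_weight, best_assignment).

    Illustrative only; exponential in the number of variables.
    """
    best_weight, best_assignment = float("-inf"), None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = sum(w for w, lits in clauses if clause_satisfied(lits, assignment))
        if weight > best_weight:
            best_weight, best_assignment = weight, assignment
    return best_weight, best_assignment

# Hypothetical candidate facts, each one a propositional variable.
variables = ["bornIn(X,Paris)", "bornIn(X,London)"]

clauses = [
    # Unit clauses: assumed weights standing in for pattern strengths.
    (0.9, [("bornIn(X,Paris)", True)]),
    (0.4, [("bornIn(X,London)", True)]),
    # Grounded rule (assumed): the two competing hypotheses cannot both hold.
    (10.0, [("bornIn(X,Paris)", False), ("bornIn(X,London)", False)]),
]

best_weight, best = weighted_max_sat(variables, clauses)
accepted = [fact for fact, value in best.items() if value]
print(accepted)
```

In this toy instance the heavily weighted constraint makes the two candidate facts compete, so the assignment that keeps only the more strongly supported fact wins. Bounding the number of grounded rules, as the abstract describes, matters because every grounded rule becomes one more clause for the solver.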