Fact collections are mostly built using semi-supervised relation extraction techniques and wisdom-of-the-crowds methods, which makes them inherently noisy. In this paper, we propose to validate the resulting facts by leveraging global constraints inherent in large fact collections, observing that the arguments of a correct fact tend to match the arguments of other facts more often than the arguments of an incorrect fact do. We model this intuition as a graph-ranking problem over a fact graph and explore novel random walk algorithms. We present an empirical study over a large set of facts extracted from a 500-million-document web crawl, validating the model and showing that it improves fact quality over state-of-the-art methods.
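The graph-ranking intuition can be illustrated with a minimal sketch. This is not the paper's algorithm: it swaps in a plain PageRank-style random walk, and it assumes an edge rule (two facts are linked when they share at least one argument) that the abstract does not specify. The function names `build_fact_graph` and `rank_facts`, the damping value, and the toy facts are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_fact_graph(facts):
    """Return an undirected adjacency map over fact indices.

    Assumed edge rule (not from the paper): two facts are neighbors
    when they share at least one argument string.
    """
    facts_by_arg = defaultdict(set)
    for idx, (_relation, args) in enumerate(facts):
        for arg in args:
            facts_by_arg[arg].add(idx)

    neighbors = {idx: set() for idx in range(len(facts))}
    for members in facts_by_arg.values():
        for a, b in combinations(sorted(members), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def rank_facts(neighbors, damping=0.85, iterations=50):
    """Score facts with a standard PageRank-style random walk.

    A fact with no neighbors (a dangling node) spreads its score
    uniformly over all facts, so the scores always sum to 1.
    """
    n = len(neighbors)
    if n == 0:
        return {}
    scores = {node: 1.0 / n for node in neighbors}
    for _ in range(iterations):
        dangling = sum(scores[node] for node, nbrs in neighbors.items() if not nbrs)
        updated = {}
        for node in neighbors:
            # The graph is undirected, so a node's neighbors are also its in-links.
            incoming = sum(scores[m] / len(neighbors[m]) for m in neighbors[node])
            updated[node] = (1 - damping) / n + damping * (incoming + dangling / n)
        scores = updated
    return scores


# Toy data: the last fact shares no argument with any other fact.
facts = [
    ("acted-in", ("Tom Hanks", "Forrest Gump")),
    ("directed", ("Robert Zemeckis", "Forrest Gump")),
    ("acted-in", ("Tom Hanks", "Cast Away")),
    ("directed", ("Robert Zemeckis", "Cast Away")),
    ("directed", ("Charlie Chaplin", "Star Trek")),
]
graph = build_fact_graph(facts)
scores = rank_facts(graph)
lowest = min(scores, key=scores.get)
```

In this toy graph the four mutually supporting facts form a cycle and receive equal scores, while the isolated fact ends up ranked lowest. A real system would need a more careful edge definition and weighting than this shared-argument rule.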