We have created layers of annotation on the English Gigaword v.5 corpus to make it usable as a standardized corpus for knowledge extraction and distributional semantics. Most existing large-scale work relies on inconsistent corpora that research teams have often had to re-annotate independently, each time introducing biases, so that results are comparable only at a high level. We provide the community with a public reference set based on current state-of-the-art syntactic analysis and coreference resolution, along with an interface for programmatic access. Our goal is to enable broader involvement in large-scale knowledge-acquisition efforts by researchers who might not otherwise have been able to produce such a resource on their own.