Context is used in many NLP systems as an indicator of a term's syntactic and semantic function. The accuracy of such a system depends on the quality and quantity of contextual information available to describe each term. The quantity, however, is no longer fixed by limited corpus resources. Given fixed training time and computational resources, it makes sense for systems to invest time in extracting high-quality contextual information from a fixed corpus. With an effectively limitless quantity of text available, however, extraction rate and representation size also need to be considered. We use thesaurus extraction with a range of context-extraction tools to demonstrate the interaction between context quantity, time and size on a corpus of 300 million words.
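To make the quantities in the abstract concrete, the following is a minimal sketch of one very simple kind of context extractor, assuming a fixed word window and cosine similarity. It is not the paper's tooling: the function names (`extract_contexts`, `representation_size`, `cosine`, `nearest_terms`), the window size and the toy sentence are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def extract_contexts(tokens, window=1):
    """Map each term to counts of (relative offset, neighbouring word) contexts.

    Assumption: a plain word window stands in for the range of context
    extractors mentioned in the abstract.
    """
    contexts = defaultdict(Counter)
    for i, term in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                contexts[term][(j - i, tokens[j])] += 1
    return contexts


def representation_size(contexts):
    """Number of distinct (term, context) pairs stored: a rough size measure."""
    return sum(len(vector) for vector in contexts.values())


def cosine(a, b):
    """Cosine similarity between two sparse context-count vectors."""
    dot = sum(count * b[ctx] for ctx, count in a.items() if ctx in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_terms(contexts, target, k=3):
    """Return the k terms whose context vectors are most similar to target's."""
    target_vector = contexts.get(target, Counter())
    scores = [
        (cosine(target_vector, vector), term)
        for term, vector in contexts.items()
        if term != target
    ]
    # Sort by descending similarity, breaking ties alphabetically.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scores[:k]]


if __name__ == "__main__":
    tokens = "the cat sat on the mat the dog sat on the rug".split()
    contexts = extract_contexts(tokens, window=1)
    print(nearest_terms(contexts, "cat", k=1))
    print(representation_size(contexts))
```

In this sketch, widening `window` or feeding in more text increases both extraction time and the value returned by `representation_size`, which is the kind of trade-off between context quantity, time and size that the abstract refers to.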