Many applications need a lexicon that represents semantic information, but acquiring lexical information is time-consuming. We present a corpus-based bootstrapping algorithm that helps users create domain-specific semantic lexicons quickly. The algorithm takes a text corpus representative of the domain and a small set of ‘seed words’ that belong to a semantic class of interest. It hypothesizes new words that are also likely to belong to the semantic class because they occur in the same contexts as the seed words. The best hypotheses are added to the seed word list dynamically, and the process iterates in a bootstrapping fashion. When bootstrapping halts, a ranked list of hypothesized category words is presented to a user for review. We used this algorithm to generate semantic lexicons for eleven semantic classes associated with the MUC-4 terrorism domain.
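The bootstrapping loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring function or the definition of a context, so everything concrete here is an assumption. The function name `bootstrap_lexicon`, its parameters (`iterations`, `add_per_iter`, `window`), the fixed-size word window used as "context", and the score (the fraction of a word's occurrences that fall next to a current category word) are all hypothetical. The sketch also assumes the corpus has already been tokenized into sentences of content words.

```python
from collections import Counter


def bootstrap_lexicon(sentences, seeds, iterations=3, add_per_iter=2, window=1):
    """Grow a seed word list by repeatedly adding words seen near category words.

    All names and the scoring rule are illustrative assumptions, not the
    method from the paper. Returns the hypothesized words in the order in
    which they were accepted (best hypotheses of each iteration first).
    """
    category = list(seeds)
    # Corpus frequency of every word, used to normalize co-occurrence counts.
    freq = Counter(word for sent in sentences for word in sent)

    for _ in range(iterations):
        members = set(category)
        near = Counter()  # occurrences of each candidate next to a category word
        for sent in sentences:
            for i, word in enumerate(sent):
                if word in members:
                    continue
                lo, hi = max(0, i - window), min(len(sent), i + window + 1)
                if any(sent[j] in members for j in range(lo, hi) if j != i):
                    near[word] += 1

        # Assumed score: share of a candidate's occurrences that are near
        # a category word. Ties are broken alphabetically for determinism.
        scored = sorted(
            ((near[w] / freq[w], w) for w in near),
            key=lambda pair: (-pair[0], pair[1]),
        )
        best = [w for _, w in scored[:add_per_iter]]
        if not best:
            break  # no new hypotheses, so bootstrapping halts
        category.extend(best)  # the best hypotheses join the seed list

    return category[len(seeds):]
```

Calling `bootstrap_lexicon(sentences, ["bomb"])` on a tokenized corpus would return candidate words for the same class as "bomb". Even on a small corpus, a word that merely appears beside category words (for example a verb) can score as highly as a genuine category member, which is consistent with the abstract's point that the final list goes to a user for review.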