Improved algorithms for topic distillation in a hyperlinked environment
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Foundations of statistical natural language processing
Learning dictionaries for information extraction by multi-level bootstrapping
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Authoritative sources in a hyperlinked environment
Journal of the ACM (JACM)
Unsupervised word sense disambiguation rivaling supervised methods
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Word-sense disambiguation using decomposable models
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Automatic acquisition of hyponyms from large text corpora
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Unsupervised learning of generalized names
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Example selection for bootstrapping statistical parsers
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Corpus-based statistical sense resolution
HLT '93 Proceedings of the workshop on Human Language Technology
Understanding the Yarowsky Algorithm
Computational Linguistics
A bootstrapping method for learning semantic lexicons using extraction pattern contexts
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Bootstrapping POS taggers using unlabelled data
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Speech and Language Processing (2nd Edition)
Espresso: leveraging generic patterns for automatically harvesting semantic relations
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Effective self-training for parsing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Introduction to Information Retrieval
Graph-based analysis of semantic drift in Espresso-like bootstrapping algorithms
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Helping editors choose better seed sets for entity set expansion
Proceedings of the 18th ACM conference on Information and knowledge management
Reducing semantic drift with bagging and distributional similarity
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP - Volume 1
Person name disambiguation by bootstrapping
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
On learning subtypes of the part-whole relation: do not mix your seeds
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Unsupervised discovery of negative categories in lexicon bootstrapping
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Data & Knowledge Engineering
In bootstrapping (seed set expansion), selecting good seeds and creating stop lists are two effective ways to reduce semantic drift, but these methods generally need human supervision. In this paper, we propose a graph-based approach to helping editors choose effective seeds and stop list instances, applicable to Pantel and Pennacchiotti's Espresso bootstrapping algorithm. The idea is to select seeds and create a stop list using the rankings of instances and patterns computed by Kleinberg's HITS algorithm. Experimental results on a variation of the lexical sample task show the effectiveness of our method.
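As a rough illustration of the ranking step, the following minimal Python sketch runs HITS-style power iteration on a bipartite instance-pattern graph. It is not the authors' implementation: the function names (`hits_scores`, `rank`), the dictionary-based graph representation, the fixed iteration count, and the toy co-occurrence data are all assumptions made for this example. How the resulting rankings are turned into seeds and stop list entries is left to the paper itself.

```python
from math import sqrt


def hits_scores(cooccurrence, iterations=50):
    """HITS-style scores on a bipartite instance-pattern graph.

    cooccurrence maps (instance, pattern) pairs to non-negative weights.
    This representation is an assumption for illustration only.
    """
    instances = sorted({i for i, _ in cooccurrence})
    patterns = sorted({p for _, p in cooccurrence})
    inst_score = {i: 1.0 for i in instances}
    pat_score = {p: 1.0 for p in patterns}

    for _ in range(iterations):
        # A pattern's score is the weighted sum of the scores of the
        # instances it co-occurs with, then L2-normalized.
        new_pat = {p: 0.0 for p in patterns}
        for (i, p), w in cooccurrence.items():
            new_pat[p] += w * inst_score[i]
        norm = sqrt(sum(v * v for v in new_pat.values())) or 1.0
        pat_score = {p: v / norm for p, v in new_pat.items()}

        # An instance's score is the weighted sum of the scores of the
        # patterns it co-occurs with, then L2-normalized.
        new_inst = {i: 0.0 for i in instances}
        for (i, p), w in cooccurrence.items():
            new_inst[i] += w * pat_score[p]
        norm = sqrt(sum(v * v for v in new_inst.values())) or 1.0
        inst_score = {i: v / norm for i, v in new_inst.items()}

    return inst_score, pat_score


def rank(scores):
    """Return keys ordered by descending score, ties broken by name."""
    return sorted(scores, key=lambda k: (-scores[k], k))


# Hypothetical toy data: which instance fills which pattern slot.
toy_graph = {
    ("tokyo", "city of X"): 1.0,
    ("tokyo", "X is the capital"): 1.0,
    ("paris", "city of X"): 1.0,
    ("java", "programmed in X"): 1.0,
}

instance_scores, pattern_scores = hits_scores(toy_graph)
print(rank(instance_scores))
print(rank(pattern_scores))
```

An editor could inspect the two ranked lists produced this way when deciding which instances to use as seeds and which instances or patterns to place on a stop list. The sketch uses plain dictionaries rather than a matrix library so that it stays self-contained; a sparse matrix formulation would likely be preferable for a graph of realistic size.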