This paper studies the importance of identifying and categorizing scientific concepts as a way to achieve a deeper understanding of the research literature of a scientific community. To this end, we propose an unsupervised bootstrapping algorithm for identifying and categorizing mentions of concepts. We then propose a new clustering algorithm that uses citation contexts to cluster the extracted mentions into coherent concepts. Our evaluation of the algorithms against gold standards shows significant improvement over state-of-the-art results. More importantly, we analyze the computational linguistics literature using the proposed algorithms and show four different ways to summarize and understand the research community that are difficult to obtain with existing techniques.