NEWS '10 Proceedings of the 2010 Named Entities Workshop
Research has shown that topic-oriented words are often related to named entities and can be used for Named Entity Recognition (NER). Many have proposed measuring the topicality of words in terms of 'informativeness', based on the global distributional characteristics of words in a corpus. However, this study shows that there can be a large discrepancy between informativeness and topicality; empirically, informativeness-based features can damage the learning accuracy of NER. This paper proposes measuring the topicality of words based on local distributional features specific to individual documents, and proposes methods that transform topicality into gazetteer-like features for NER by binning. Evaluated on five datasets from three domains, the methods showed consistent improvement over a baseline, by between 0.9 and 4.0 in F-measure, and always outperformed methods that use informativeness measures.
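
The idea of turning a document-local topicality score into gazetteer-like features by binning can be illustrated with a minimal sketch. The abstract does not specify the topicality measures or the binning scheme, so everything below is an assumption made for illustration: in-document term frequency stands in for the topicality score, words are split into rank-based bins, and the names `topicality_bins`, `bin_features`, and the `TOPIC_BIN=` feature prefix are hypothetical.

```python
from collections import Counter


def topicality_bins(tokens, n_bins=3):
    """Map each distinct word of ONE document to a bin index (0 = highest score).

    Assumption: in-document term frequency is used as a stand-in topicality
    score; the paper's actual local measures are not given in the abstract.
    """
    counts = Counter(t.lower() for t in tokens)
    # Rank words by in-document frequency, highest first; ties broken alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    # Ceiling division so that every word falls into one of n_bins bins.
    bin_size = max(1, -(-len(ranked) // n_bins))
    return {w: min(i // bin_size, n_bins - 1) for i, w in enumerate(ranked)}


def bin_features(tokens, n_bins=3):
    """Emit one gazetteer-like feature string per token (hypothetical format)."""
    bins = topicality_bins(tokens, n_bins)
    return [f"TOPIC_BIN={bins[t.lower()]}" for t in tokens]


if __name__ == "__main__":
    doc = ["BRCA1", "binds", "BRCA1", "to", "BRCA1", "binds"]
    for token, feature in zip(doc, bin_features(doc)):
        print(token, feature)
```

Because the bins are computed per document, the same word may receive a different feature value in different documents, which is the contrast with corpus-level informativeness measures that the abstract draws. Each feature string could then be passed to an NER learner alongside its other token features, much as a gazetteer membership flag would be.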