We report an empirical study on the role of syntactic features in building a semi-supervised named entity (NE) tagger. Our study addresses two questions: What types of syntactic features are suitable for extracting potential NEs to train a classifier in a semi-supervised setting? How well does the resulting NE classifier perform on test instances dissimilar from its training data? Our study shows that both constituency and dependency parsing constraints are suitable features for extracting NEs and training the classifier. Moreover, the classifier showed a significant accuracy improvement when constituency features were combined with a new dependency feature. Furthermore, the degradation in accuracy on unfamiliar test cases is low, suggesting that the trained classifier generalizes well.
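The semi-supervised loop the abstract describes — extract candidate NEs under syntactic constraints, label the ones a seed lexicon covers, and fold newly classified candidates back in — can be sketched as follows. This is a minimal illustrative sketch, not the paper's method: the `candidates` function uses capitalized-token spans as a crude stand-in for the constituency and dependency parsing constraints, and the seed lexicon, example sentences, and placeholder classification rule are all hypothetical.

```python
# Hedged sketch of one semi-supervised NE bootstrapping pass.
# Assumptions (not from the paper): the toy seed lexicon, the example
# sentences, the capitalization-based candidate extractor (a stand-in for
# real parsing constraints), and the placeholder span-length classifier.

def candidates(sentence):
    """Extract maximal runs of capitalized tokens as NE candidates.

    A crude proxy for the constituency/dependency constraints used in
    the paper (e.g., NP constituents or head-dependent patterns)."""
    spans, current = [], []
    for token in sentence.split():
        if token[0].isupper():
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

# Tiny hypothetical seed set of labeled NEs.
seed = {"John Smith": "PER", "Acme Corp": "ORG"}

# Hypothetical unlabeled corpus.
unlabeled = [
    "Mary Jones joined Acme Corp last year .",
    "John Smith met Mary Jones in Boston .",
]

# One bootstrapping pass: keep seed labels for known spans; "classify"
# unseen candidates and add them to the lexicon for the next pass.
lexicon = dict(seed)
for sentence in unlabeled:
    for span in candidates(sentence):
        if span not in lexicon:
            # Placeholder classifier: two-token capitalized spans -> PER;
            # the paper would use a trained classifier over syntactic features.
            lexicon[span] = "PER" if len(span.split()) == 2 else "MISC"

print(lexicon)
```

In the paper's actual setting, the candidate extractor is driven by parser output and the placeholder rule is replaced by a trained classifier, but the grow-the-lexicon control flow is the same.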