Two-stage NER for tweets with clustering

Authors:
Xiaohua Liu; Ming Zhou
Affiliations:
Harbin Institute of Technology, Harbin 150001, China and Natural Language Computing Group, Microsoft Research Asia, Beijing 100080, China; Natural Language Computing Group, Microsoft Research Asia, Beijing 100080, China
Venue:
Information Processing and Management: an International Journal
Year:
2013

Abstract

A main challenge of Named Entity Recognition (NER) for tweets is that a single tweet carries insufficient information, because tweets are short and noisy. We propose a novel system that tackles this challenge by exploiting the redundancy in tweets: it conducts two-stage NER over multiple similar tweets. Specifically, it first pre-labels each tweet using a sequential labeler based on the linear Conditional Random Fields (CRF) model. It then clusters the tweets so that tweets with similar content fall into the same group. Finally, for each cluster, it refines the labels of each tweet using an enhanced CRF model that incorporates cluster-level information, i.e., the labels of the current word and its neighboring words across all tweets in the cluster. We evaluate our method on a manually annotated dataset and show that it raises F1 from 75.4% for the baseline without collective labeling to 82.5%.
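
The clustering step and the cluster-level information described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the clustering algorithm, or the feature encoding, so the Jaccard similarity, the greedy single-pass clustering, the 0.3 threshold, the majority vote, and all function names below are assumptions chosen for illustration. The first-stage labels are given as toy input in place of a trained CRF, and the second-stage CRF itself is not shown; the sketch only derives the features it might consume.

```python
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity between two token lists (case-insensitive)."""
    sa, sb = {w.lower() for w in a}, {w.lower() for w in b}
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cluster_tweets(tweets, threshold=0.3):
    """Greedy clustering (an assumption): a tweet joins the first cluster
    whose seed tweet is similar enough, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of tweet indices
    for i, tokens in enumerate(tweets):
        for members in clusters:
            if jaccard(tokens, tweets[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def majority_labels(tweets, prelabels, members):
    """Most frequent first-stage label of each word across one cluster."""
    votes = defaultdict(Counter)
    for i in members:
        for word, label in zip(tweets[i], prelabels[i]):
            votes[word.lower()][label] += 1
    return {word: counts.most_common(1)[0][0] for word, counts in votes.items()}


def cluster_features(tokens, majority):
    """Cluster-level labels of the current word and its two neighbors."""
    words = [w.lower() for w in tokens]
    feats = []
    for t, word in enumerate(words):
        feats.append({
            "prev": majority[words[t - 1]] if t > 0 else "<s>",
            "cur": majority[word],
            "next": majority[words[t + 1]] if t + 1 < len(words) else "</s>",
        })
    return feats


# Toy data: tokenized tweets and hypothetical first-stage labels.
tweets = [
    ["obama", "visits", "berlin", "today"],
    ["Obama", "visits", "Berlin", "on", "Tuesday"],
    ["Obama", "in", "Berlin"],
    ["new", "iphone", "released"],
]
prelabels = [
    ["O", "O", "B-LOC", "O"],  # lowercase "obama" missed by stage one
    ["B-PER", "O", "B-LOC", "O", "O"],
    ["B-PER", "O", "B-LOC"],
    ["O", "B-PRODUCT", "O"],
]

clusters = cluster_tweets(tweets)
majority = majority_labels(tweets, prelabels, clusters[0])
features = cluster_features(tweets[0], majority)
```

In this toy input, the first tweet's lowercase "obama" is pre-labeled `O`, but the other two tweets in its cluster label the same word `B-PER`, so the cluster-level feature for that position carries the evidence a second-stage model would need to correct the label.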