Joint inference of named entity recognition and normalization for tweets

Authors:
Xiaohua Liu;Ming Zhou;Furu Wei;Zhongyang Fu;Xiangyang Zhou
Affiliations:
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China and Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Shanghai Jiao Tong University, Shanghai, China;Shandong University, Jinan, China
Venue:
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Year:
2012

Citing 23
Cited 3

Location normalization for information extraction

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Information extraction from voicemail transcripts

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Extracting personal names from email: applying named entity recognition to informal text

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Named entity normalization in user generated content

Proceedings of the second workshop on Analytics for noisy unstructured text data
Reranking for biomedical named-entity recognition

BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
Design challenges and misconceptions in named entity recognition

CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
A skip-chain conditional random field for ranking meeting utterances by importance

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Locating complex named entities in web text

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Unsupervised gene/protein named entity normalization using automatically extracted dictionaries

ISMB '05 Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics
Unsupervised named-entity extraction from the Web: An experimental study

Artificial Intelligence
Arabic cross-document person name normalization

Semitic '07 Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources
Annotating and recognising named entities in clinical notes

ACLstudent '09 Proceedings of the ACL-IJCNLP 2009 Student Research Workshop
Nested named entity recognition

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
The impact of named entity normalization on information retrieval for question answering

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
Minimally-supervised extraction of entities from text advertisements

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Annotating named entities in Twitter data with crowdsourcing

CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
Domain adaptation of rule-based annotators for named-entity recognition tasks

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Recognizing named entities in tweets

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Lexical normalisation of short text messages: makn sens a #twitter

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Loopy belief propagation for approximate inference: an empirical study

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Named entity recognition in tweets: an experimental study

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

I can do text analytics!: designing development tools for novice developers

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Microblog-genre noise and impact on semantic annotation accuracy

Proceedings of the 24th ACM Conference on Hypertext and Social Media
Joint Optimization for Chinese POS Tagging and Dependency Parsing

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Tweets represent a critical source of fresh information, in which named entities occur frequently with rich variations. We study the problem of named entity normalization (NEN) for tweets. Two main challenges are the errors propagated from named entity recognition (NER) and the dearth of information in a single tweet. We propose a novel graphical model to simultaneously conduct NER and NEN on multiple tweets to address these challenges. Particularly, our model introduces a binary random variable for each pair of words with the same lemma across similar tweets, whose value indicates whether the two related words are mentions of the same entity. We evaluate our method on a manually annotated data set, and show that our method outperforms the baseline that handles these two tasks separately, boosting the F1 from 80.2% to 83.6% for NER, and the Accuracy from 79.4% to 82.6% for NEN, respectively.