Two main challenges of Named Entity Recognition (NER) for tweets are the insufficient information in a single tweet and the lack of training data. We propose a novel method consisting of three core elements: (1) normalization of tweets; (2) a combination of a K-Nearest Neighbors (KNN) classifier with a linear Conditional Random Fields (CRF) model; and (3) a semisupervised learning framework. The tweet normalization step is a preprocessing stage that corrects common ill-formed words using a global linear model. The KNN-based classifier performs prelabeling to collect global, coarse-grained evidence across tweets, while the CRF model performs sequential labeling to capture the fine-grained information encoded in a single tweet. Semisupervised learning, together with gazetteers, alleviates the lack of training data. Extensive experiments show the advantages of our method over the baselines, as well as the effectiveness of normalization, KNN, and semisupervised learning.
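To make the KNN-then-CRF combination more concrete, the following is a minimal sketch of the prelabeling idea only. It is not the authors' implementation: the bag-of-words context representation, the cosine similarity, the feature names, and the toy data are all assumptions made for illustration, and the CRF itself is not implemented. The sketch shows how a KNN vote over a labeled pool drawn from many tweets could be attached to each token as one feature for a downstream sequence labeler.

```python
from collections import Counter
import math


def context_vector(tokens, i, window=1):
    """Bag of lowercased words within +/- window of position i (assumed representation)."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return Counter(t.lower() for t in tokens[lo:hi])


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_prelabel(vec, labeled, k=3):
    """Majority label among the k labeled vectors most similar to vec."""
    nearest = sorted(labeled, key=lambda ex: cosine(vec, ex[0]), reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def token_features(tokens, labeled, k=3):
    """Per-token feature dicts; the KNN prelabel is just one feature among others.

    A CRF toolkit would consume these dicts; the feature names are hypothetical.
    """
    feats = []
    for i, tok in enumerate(tokens):
        feats.append({
            "word": tok.lower(),
            "is_capitalized": tok[:1].isupper(),
            "knn_label": knn_prelabel(context_vector(tokens, i), labeled, k),
        })
    return feats


# Hypothetical labeled pool collected across several (already normalized) tweets.
train = [
    (["visiting", "Seattle", "tomorrow"], ["O", "LOC", "O"]),
    (["flying", "to", "Boston", "tonight"], ["O", "O", "LOC", "O"]),
    (["love", "this", "song"], ["O", "O", "O"]),
]
labeled = [
    (context_vector(toks, i), lab)
    for toks, labs in train
    for i, lab in enumerate(labs)
]

features = token_features(["visiting", "Boston", "tomorrow"], labeled, k=1)
for f in features:
    print(f)
```

In this sketch the KNN vote draws on examples from other tweets, which is one way to read "global coarse evidence", while the sequence model that would consume `features` is left to whichever CRF toolkit is available.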