Using corpus-derived name lists for named entity recognition

Authors:
Mark Stevenson;Robert Gaizauskas
Affiliations:
University of Sheffield, Sheffield, United Kingdom;University of Sheffield, Sheffield, United Kingdom
Venue:
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Year:
2000

Abstract

This paper describes experiments to establish the performance of a named entity recognition system which builds categorized lists of names from manually annotated training data. Names in text are then identified using only these lists. This approach does not perform as well as state-of-the-art named entity recognition systems. However, we then show that by using simple filtering techniques for improving the automatically acquired lists, substantial performance benefits can be achieved, with resulting F-measure scores of 87% on a standard test set. These results provide a baseline against which the contribution of more sophisticated supervised learning techniques for NE recognition should be measured.
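
The list-based approach in the abstract can be illustrated with a minimal sketch. The abstract does not specify the data format, the matching strategy, or the filtering techniques, so everything below is an assumption made for illustration: the function names (build_name_lists, filter_lists, tag), the span-based annotation format, the greedy longest-match lookup, and the filter that drops single-word names also found in a list of common words. It is not the authors' implementation.

```python
from collections import defaultdict

def build_name_lists(annotated_sentences):
    """Collect categorized name lists from annotated training data.

    Assumed (hypothetical) format: each item is (tokens, spans), where
    spans is a list of (start, end, category) with end exclusive.
    """
    lists = defaultdict(set)
    for tokens, spans in annotated_sentences:
        for start, end, category in spans:
            lists[category].add(tuple(tokens[start:end]))
    return dict(lists)

def filter_lists(lists, common_words):
    """Illustrative filter (an assumption, not the paper's method):
    drop single-token names that are also common words."""
    return {
        category: {
            name for name in names
            if not (len(name) == 1 and name[0].lower() in common_words)
        }
        for category, names in lists.items()
    }

def tag(tokens, lists):
    """Mark names in text using only the lists (greedy longest match)."""
    # If a name occurs in several categories, the last one seen wins.
    lookup = {name: cat for cat, names in lists.items() for name in names}
    max_len = max((len(name) for name in lookup), default=0)
    found, i = [], 0
    while i < len(tokens):
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in lookup:
                found.append((i, i + length, lookup[candidate]))
                i += length
                break
        else:
            i += 1  # no list entry starts at this token
    return found

# Toy training data, invented for this example.
training = [
    (["Mark", "Stevenson", "works", "at", "Sheffield", "University"],
     [(0, 2, "PERSON"), (4, 6, "ORGANIZATION")]),
    (["Will", "joined", "Ford"],
     [(0, 1, "PERSON"), (2, 3, "ORGANIZATION")]),
]
raw_lists = build_name_lists(training)
filtered_lists = filter_lists(raw_lists, common_words={"will"})
sentence = ["Will", "Mark", "Stevenson", "visit", "Ford", "?"]
raw_tags = tag(sentence, raw_lists)
filtered_tags = tag(sentence, filtered_lists)
```

In this toy case the unfiltered lists mark the auxiliary verb "Will" as a person, while the filtered lists do not. This is the kind of error that filtering an automatically acquired list is meant to reduce.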