Named Entity recognition without gazetteers

Authors:
Andrei Mikheev;Marc Moens;Claire Grover
Affiliations:
University of Edinburgh, Edinburgh, UK;University of Edinburgh, Edinburgh, UK;University of Edinburgh, Edinburgh, UK
Venue:
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
Year:
1999

Abstract

It is often claimed that Named Entity recognition systems need extensive gazetteers, that is, lists of names of people, organisations, locations, and other named entities. Indeed, the compilation of such gazetteers is sometimes mentioned as a bottleneck in the design of Named Entity recognition systems. We report on a Named Entity recognition system which combines rule-based grammars with statistical (maximum entropy) models. We report on the system's performance with gazetteers of different types and different sizes, using test material from the MUC-7 competition. We show that, for the text type and task of this competition, it is sufficient to use relatively small gazetteers of well-known names, rather than large gazetteers of low-frequency names. We conclude with observations about the domain independence of the competition and of our experiments.