Using co-occurrence models for placename disambiguation

Authors:
Simon Overell; Stefan Rüger
Affiliations:
Multimedia and Information Systems, Department of Computing, South Kensington Campus, Imperial College London, London SW7 2AZ, UK; Multimedia and Information Systems, Department of Computing, South Kensington Campus, Imperial College London, London SW7 2AZ, UK, and Knowledge Media Institute, the Open University, Milton Keynes, UK
Venue:
International Journal of Geographical Information Science
Year:
2008


Abstract

This paper describes the generation of a model that captures how placenames co-occur. We discuss the advantages of the co-occurrence model over traditional gazetteers and present placename disambiguation as a case study. We begin by outlining the problem of ambiguous placenames, then demonstrate how analysis of Wikipedia can be used to generate a co-occurrence model. The accuracy of our model is compared to a handcrafted ground truth; we then evaluate alternative methods of applying the model to the disambiguation of placenames in free text (using the GeoCLEF evaluation forum). We conclude by showing that including placenames in both the text and geographic parts of a query yields the highest mean average precision, and we outline the benefits of a co-occurrence model as a data source for the wider field of geographic information retrieval (GIR).
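
The abstract does not give the details of the model, so the following is only a minimal sketch of the general idea of disambiguating a placename from the locations it co-occurs with. It is not the authors' method. The toy corpus `DOCS`, the candidate table `CANDIDATES`, and the function names `build_cooccurrence` and `disambiguate` are all hypothetical and invented for illustration; a real model would presumably be built from a much larger annotated source such as Wikipedia.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is a list of locations that have
# already been resolved unambiguously (invented data, not from the paper).
DOCS = [
    ["London, UK", "Cambridge, UK", "Oxford, UK"],
    ["Cambridge, UK", "London, UK"],
    ["Cambridge, MA", "Boston, MA"],
    ["Boston, MA", "Cambridge, MA", "New York, NY"],
    ["London, Ontario", "Toronto, Ontario"],
]

# Hypothetical lookup from an ambiguous surface name to candidate locations.
CANDIDATES = {
    "Cambridge": ["Cambridge, UK", "Cambridge, MA"],
    "London": ["London, UK", "London, Ontario"],
}


def build_cooccurrence(docs):
    """Count how many documents mention each unordered pair of locations."""
    pair_counts = Counter()
    for doc in docs:
        # Sorting gives every pair a canonical order, so (a, b) == (b, a).
        for a, b in combinations(sorted(set(doc)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cooccurrence(counts, a, b):
    """Look up the count for a pair regardless of argument order."""
    return counts[tuple(sorted((a, b)))]


def disambiguate(name, context, counts, candidates):
    """Pick the candidate that co-occurs most often with the context locations."""
    options = candidates[name]
    scores = {
        loc: sum(cooccurrence(counts, loc, ctx) for ctx in context)
        for loc in options
    }
    # On a tie, max() keeps the first candidate listed for the name.
    return max(options, key=lambda loc: scores[loc])


MODEL = build_cooccurrence(DOCS)
```

Under these assumptions, `disambiguate("Cambridge", ["Boston, MA"], MODEL, CANDIDATES)` selects the Massachusetts reading, because only that candidate shares documents with Boston in the toy corpus. Raw pair counts are the simplest possible scoring choice; they are used here only to keep the sketch short.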