An empirical study of the effects of NLP components on Geographic IR performance

Authors:
Nicola Stokes;Yi Li;Alistair Moffat;Jiawen Rong
Affiliations:
NICTA Victoria Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia (all authors)
Venue:
International Journal of Geographical Information Science
Year:
2008

Abstract

Natural language processing (NLP) techniques, such as toponym detection and resolution, are an integral part of most geographic information retrieval (GIR) architectures. Without these components, synonym detection, ambiguity resolution and accurate toponym expansion would not be possible. However, there are many important factors affecting the success of an NLP approach to GIR, including toponym detection errors, toponym resolution errors and query overloading. The aim of this paper is to determine how severe these errors are in state-of-the-art systems, and to what extent they affect GIR performance. We show that a careful choice of weighting schemes in the IR engine can minimize the negative impact of these errors on GIR accuracy. We provide empirical evidence from the GeoCLEF 2005 and 2006 datasets to support our observations.