String distance metrics for reference matching and search query correction

  • Authors:
  • Jakub Piskorski; Marcin Sydow

  • Affiliations:
  • Joint Research Center of the European Commission, Web and Language Technology Group of IPSC, Ispra, VA, Italy; Polish-Japanese Institute of Information Technology, Department of Intelligent Systems, Warsaw, Poland

  • Venue:
  • BIS'07: Proceedings of the 10th International Conference on Business Information Systems
  • Year:
  • 2007

Abstract

String distance metrics are widely used in applications that process textual data. This paper explores their usability for two tasks, reference matching and the automatic correction of misspelled search engine queries, in the context of highly inflective languages, with a particular focus on Polish. It presents the results of numerous experiments in different scenarios, which revealed some preferred metrics. Surprisingly good results were observed for correcting misspelled search engine queries. Nevertheless, a more in-depth analysis is necessary to achieve improvements. The work reported here constitutes a good point of departure for further research on this topic.