Issues in searching for Indian language web content

  • Authors:
  • Dipasree Pal; Prasenjit Majumder; Mandar Mitra; Sukanya Mitra; Aparajita Sen

  • Affiliations:
  • Indian Statistical Institute, Kolkata, India; Indian Statistical Institute, Kolkata, India; Indian Statistical Institute, Kolkata, India; Indian Statistical Institute, Kolkata, India; Indian Statistical Institute, Kolkata, India

  • Venue:
  • Proceedings of the 2nd ACM workshop on Improving non english web searching
  • Year:
  • 2008

Abstract

This paper examines the problem of searching for Indian language (IL) content on the Web. Although the amount of IL content available on the Web is growing rapidly, searching it with the most popular Web search engines poses certain problems. Because these search engines do not apply stemming or orthographic normalization to Indian languages, recall for IL searches can be low. We provide examples that indicate the extent of this problem and suggest a simple and efficient solution.
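
The abstract does not spell out the proposed solution, so the following is only an illustrative sketch of how orthographic normalization can raise recall, not the authors' method. It assumes that normalization means Unicode canonical composition (NFC) plus removal of zero-width joiners, and the names `normalize_term`, `build_index`, and `search` are hypothetical. The same normalization is applied to indexed tokens and to the query, so that two encodings of the same word match.

```python
import unicodedata

# Assumption: zero-width (non-)joiners only affect rendering and can be
# dropped for matching purposes.
ZERO_WIDTH = {"\u200c", "\u200d"}  # ZWNJ, ZWJ


def normalize_term(term: str) -> str:
    """Map canonically equivalent spellings of a term to one form."""
    # NFC gives canonically equivalent code point sequences the same form.
    composed = unicodedata.normalize("NFC", term)
    return "".join(ch for ch in composed if ch not in ZERO_WIDTH)


def build_index(docs: dict) -> dict:
    """Build a toy inverted index keyed by normalized whitespace tokens."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(normalize_term(token), set()).add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Look up a single-term query after normalizing it the same way."""
    return index.get(normalize_term(query), set())


# Two documents containing the same Bengali word in different encodings:
# doc 1 uses the single code point U+09DF, doc 2 uses U+09AF + U+09BC.
docs = {1: "\u09a8\u09df\u09a8", 2: "\u09a8\u09af\u09bc\u09a8"}
index = build_index(docs)
matches = search(index, "\u09a8\u09df\u09a8")
```

Without the normalization step, an exact-match lookup on the query would retrieve only the document whose encoding happens to agree with the query, which is the kind of recall loss the abstract describes. Stemming would be a separate, language-specific step and is not shown here.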