Information Retrieval on the Web

  • Authors:
  • Maristella Agosti;Massimo Melucci

  • Venue:
  • ESSIR '00 Proceedings of the Third European Summer-School on Lectures on Information Retrieval-Revised Lectures
  • Year:
  • 2000

Abstract

Information Retrieval (IR) on the Web can be considered from many different perspectives, but one objective and relevant fact is that in mid-1999 the estimated number of pages published and available for indexing on the Web was 800 million, amounting to 6 terabytes of textual data. Those Web pages were estimated to be distributed over 3 million Web servers. This means that no one can afford to explore all the information distributed over those pages; end users necessarily need tools that help them choose the most relevant Web pages to answer any specific information request. The Web started operating only 10 years ago, and just a few years later the first information retrieval tools were made available to help Web users find Web pages with relevant information. To deal with the complexity and heterogeneity of the Web, we need search tools implementing indexing and retrieval algorithms that are more advanced than those currently employed in IR. These advanced algorithms need to exploit the structure of, and the interrelationships among, Web pages.

From a research point of view, we also need to rethink evaluation because of the different characteristics of Web IR, which can be expressed in terms of data, functionalities, architecture, and tools. These characteristics affect 'how' to carry out evaluation and 'what' to evaluate.

This chapter addresses the different aspects of IR on the Web that can be considered and analysed: the history of IR on the Web; the different types of tools for performing IR on the Web, which have been designed and developed to answer different user requirements; the architecture and components of those Web IR tools; the indexing and retrieval algorithms that can be employed to make Web IR effective; and methods for evaluating Web IR.