Improving search results with data mining in a thematic search engine

  • Authors:
  • M. Caramia; G. Felici; A. Pezzoli

  • Affiliations:
  • Istituto per le Applicazioni del Calcolo IAC-CNR, Viale del Policlinico, 137, 00161 Roma, Italy; Istituto di Analisi dei Sistemi ed Informatica IASI-CNR, Viale Manzoni 30, 00185 Roma, Italy; Istituto di Analisi dei Sistemi ed Informatica IASI-CNR, Viale Manzoni 30, 00185 Roma, Italy

  • Venue:
  • Computers and Operations Research
  • Year:
  • 2004

Abstract

The problem of obtaining relevant results in web searching has been tackled with several approaches. The most popular search engines use very effective techniques when no a priori knowledge of the user's desires is available besides the search keywords. In other settings, however, it is conceivable to design search methods that operate on a thematic database of web pages that refer to a common body of knowledge or to specific sets of users. Starting from these premises, we have designed and developed a search method that deploys data mining and optimization techniques to provide a more significant and restricted set of pages as the final result of a user search. We adopt a vectorization method based on search context and user profile to apply clustering techniques, whose results are then refined by a specially designed genetic algorithm. In this paper we describe the method, its implementation, and the algorithms applied, and we discuss some experiments that have been run on test sets of web pages.
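
To make the pipeline named in the abstract more concrete, the following is a minimal sketch of its three stages: vectorization, clustering, and genetic refinement. It is not the authors' implementation. The abstract does not specify the vector features, the clustering algorithm, or the genetic operators, so everything here is an assumption made for illustration: term counts over a hypothetical context vocabulary, plain k-means seeded with the first k vectors, a fitness that rewards relevance plus cluster coverage, and the example pages and relevance scores themselves.

```python
import math
import random


def vectorize(text, vocabulary):
    """Count how often each vocabulary term occurs in the page text (assumed feature)."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: distance(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def fitness(subset, scores, labels):
    """Assumed objective: total relevance plus the number of clusters covered."""
    return sum(scores[i] for i in subset) + len({labels[i] for i in subset})


def genetic_select(scores, labels, size, generations=50, population=20, seed=0):
    """Evolve a subset of `size` page indices with elitism, crossover and mutation."""
    rng = random.Random(seed)
    n = len(scores)
    pop = [rng.sample(range(n), size) for _ in range(population)]
    best = max(pop, key=lambda s: fitness(s, scores, labels))
    for _ in range(generations):
        next_pop = [best]  # elitism: the best subset so far always survives
        while len(next_pop) < population:
            a, b = rng.sample(pop, 2)
            # Crossover: draw the child from the union of both parents.
            child = rng.sample(list(set(a) | set(b)), size)
            if rng.random() < 0.2:
                # Mutation: swap one selected page for an unselected one.
                outside = [i for i in range(n) if i not in child]
                if outside:
                    child[rng.randrange(size)] = rng.choice(outside)
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=lambda s: fitness(s, scores, labels))
    return sorted(best)


# Hypothetical thematic database, context vocabulary and relevance scores.
pages = [
    "jazz guitar chords lesson",
    "python code tutorial",
    "jazz saxophone solo jazz",
    "python code code review",
    "guitar lesson jazz guitar",
]
vocabulary = ["jazz", "guitar", "python", "code"]
scores = [0.9, 0.8, 0.4, 0.7, 0.6]

vectors = [vectorize(p, vocabulary) for p in pages]
labels = kmeans(vectors, k=2)
selected = genetic_select(scores, labels, size=2)
print(labels, selected)
```

In this sketch the genetic step only chooses which pages to return once the clusters are fixed. The method described in the paper may refine the clusters in a different way.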