An Architecture for Finding Entities on the Web

  • Authors:
  • Gianluca Demartini; Claudiu S. Firan; Mihai Georgescu; Tereza Iofciu; Ralf Krestel; Wolfgang Nejdl

  • Venue:
  • LA-WEB '09: Proceedings of the 2009 Latin American Web Congress (LA-WEB 2009)
  • Year:
  • 2009

Abstract

Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems that provide better search experiences to web users. For example, systems that retrieve entities instead of just documents have been built. In this paper we present an approach for large-scale Entity Retrieval that uses web collections as the underlying corpus. We propose an architecture for entity extraction and entity ranking that starts from web documents. This is achieved by (1) using an existing web document index and (2) creating an entity-centric index. We describe the advantages and feasibility of our approach using state-of-the-art tools.
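
The two-index idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the capitalized-phrase extractor, the function names, and the count-based scoring are all assumptions chosen for illustration, standing in for the web document index, the extraction tools, and the ranking method that the paper itself describes.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus standing in for a web collection.
DOCS = {
    "d1": "Alan Turing worked at Bletchley Park during the war.",
    "d2": "The war years saw Alan Turing design the Bombe.",
    "d3": "Bletchley Park is now a museum.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def extract_entities(text):
    """Placeholder extractor (assumption): runs of two or more capitalized words."""
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+", text)


def build_document_index(docs):
    """Stand-in for the existing document index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_entity_index(docs):
    """Entity-centric index: entity -> set of ids of documents mentioning it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for entity in extract_entities(text):
            index[entity].add(doc_id)
    return index


def rank_entities(query, doc_index, entity_index):
    """Score each entity by how many query-matching documents mention it."""
    matching = set()
    for term in tokenize(query):
        matching |= doc_index.get(term, set())
    scored = [(entity, len(doc_ids & matching)) for entity, doc_ids in entity_index.items()]
    # Drop entities with no supporting document; sort by score, then by name.
    return sorted(((e, s) for e, s in scored if s > 0), key=lambda p: (-p[1], p[0]))


doc_index = build_document_index(DOCS)
entity_index = build_entity_index(DOCS)
print(rank_entities("war", doc_index, entity_index))
```

In this sketch the document index answers "which documents match the query", and the entity-centric index turns those documents into candidate entities. A real system would replace the regular-expression extractor with a proper named-entity recognizer and the raw count with a tuned ranking function.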