Automatically building probabilistic databases from the web

  • Authors:
  • Lorenzo Blanco; Mirko Bronzi; Valter Crescenzi; Paolo Merialdo; Paolo Papotti

  • Affiliations:
  • Università Roma Tre, Roma, Italy (all authors)

  • Venue:
  • Proceedings of the 20th International Conference Companion on World Wide Web
  • Year:
  • 2011

Abstract

A significant number of web sites publish structured data about recognizable concepts (such as stock quotes, movies, restaurants, etc.). This creates a great opportunity to build applications that rely on a huge amount of data taken from the Web. We present an automatic and domain-independent system that performs all the steps required to benefit from these data: it discovers data-intensive web sites containing information about an entity of interest, extracts and integrates the published data, and finally performs a probabilistic analysis to characterize the imprecision of the data and the accuracy of the sources. The results of the processing can be used to populate a probabilistic database.
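
The final step, the probabilistic analysis, can be illustrated with a minimal sketch. The abstract does not specify the model the authors use, so the code below assumes a simple iterative weighted-voting scheme: each source's vote is weighted by its current accuracy estimate, and each source's accuracy is then re-estimated as the average probability of the values it reports. The function name `estimate_accuracy`, the input layout, and the initial accuracy of 0.8 are all hypothetical choices made for illustration only.

```python
from collections import defaultdict


def estimate_accuracy(observations, iterations=10, initial_accuracy=0.8):
    """Jointly estimate source accuracy and value probabilities.

    observations: dict mapping source -> {object: reported value}.
    Returns (accuracy, probs), where accuracy maps source -> float and
    probs maps object -> {value: probability}.
    This is an illustrative voting scheme, not the system's actual model.
    """
    sources = list(observations)
    accuracy = {s: initial_accuracy for s in sources}
    objects = {o for s in sources for o in observations[s]}
    probs = {}

    for _ in range(iterations):
        # Step 1: for each object, let sources vote, weighted by accuracy.
        probs = {}
        for o in objects:
            scores = defaultdict(float)
            for s in sources:
                if o in observations[s]:
                    scores[observations[s][o]] += accuracy[s]
            total = sum(scores.values())
            probs[o] = {v: w / total for v, w in scores.items()}

        # Step 2: a source's accuracy is the mean probability of its claims.
        for s in sources:
            claims = observations[s]
            if claims:  # sources with no claims keep their prior estimate
                accuracy[s] = sum(probs[o][v] for o, v in claims.items()) / len(claims)

    return accuracy, probs
```

The resulting per-value probabilities are the kind of output that could populate a probabilistic database: each object is stored with a distribution over candidate values rather than a single value, and each source carries an accuracy score.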