Automatic web page annotation with Google Rich Snippets

  • Authors:
  • Walter Hop; Stephan Lachner; Flavius Frasincar; Roberto De Virgilio

  • Affiliations:
  • Erasmus University Rotterdam, Erasmus School of Economics, Rotterdam, The Netherlands; Erasmus University Rotterdam, Erasmus School of Economics, Rotterdam, The Netherlands; Erasmus University Rotterdam, Erasmus School of Economics, Rotterdam, The Netherlands; Dipartimento di Informatica e Automazione, Università Roma Tre, Rome, Italy

  • Venue:
  • OTM'10 Proceedings of the 2010 international conference on On the move to meaningful internet systems: Part II
  • Year:
  • 2010

Abstract

Web pages are designed to be read by people, not machines. Consequently, searching and reusing information on the Web is difficult without human participation. Adding semantics (i.e., meaning) to a Web page would help machines understand Web content and better support the Web search process. One of the latest developments in this field is Google's Rich Snippets, a service that lets Web site owners add semantics to their Web pages. In this paper we provide an approach to automatically annotate a Web page with Rich Snippets RDFa tags. Exploiting several heuristics and a named entity recognition technique, our method is capable of recognizing and annotating a subset of the Rich Snippets vocabulary, namely all attributes of its Review concept and the names of Person and Organization concepts. We implemented an online service and evaluated the accuracy of the approach on real e-commerce Web sites.
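
To make the idea of heuristic RDFa annotation concrete, the following is a minimal sketch and not the authors' method. It assumes the Rich Snippets RDFa vocabulary as Google documented it at the time (namespace `http://rdf.data-vocabulary.org/#`, with `v:Review`, `v:itemreviewed`, `v:description`, and `v:rating`). The function name `annotate_review` and the rating regular expression are hypothetical, chosen only for illustration; the abstract does not describe the paper's actual heuristics or its named entity recognition step.

```python
import re
from html import escape

# Namespace assumed from Google's Rich Snippets RDFa documentation of the time.
RICH_SNIPPETS_NS = "http://rdf.data-vocabulary.org/#"

# Hypothetical heuristic: a rating looks like "4.5 out of 5" or "4/5".
RATING_PATTERN = re.compile(r"\b(\d(?:\.\d)?)\s*(?:/|out of)\s*5\b")


def annotate_review(item_name: str, review_text: str) -> str:
    """Wrap a plain-text review in RDFa markup for the Review concept.

    Illustrative only: the paper's real heuristics are not given in the abstract.
    """
    parts = [
        f'<div xmlns:v="{RICH_SNIPPETS_NS}" typeof="v:Review">',
        f'<span property="v:itemreviewed">{escape(item_name)}</span>',
        f'<span property="v:description">{escape(review_text)}</span>',
    ]
    match = RATING_PATTERN.search(review_text)
    if match:
        # Emit the rating as a sibling element so it is not nested
        # inside the description literal.
        parts.append(f'Rating: <span property="v:rating">{match.group(1)}</span>')
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(annotate_review("Acme Camera", "Sharp pictures, 4.5 out of 5."))
```

Calling `annotate_review` with a review that contains no recognizable score simply omits the `v:rating` element, which mirrors the general principle that an annotator should only tag what it can recognize.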