Protection Techniques from Information Extraction

  • Authors:
  • Gianluigi Greco;Giovambattista Ianni;Vincenzino Lio;Luigi Palopoli

  • Affiliations:
  • Università della Calabria, Italy;Università della Calabria, Italy;Università della Calabria, Italy;Università della Calabria, Italy

  • Venue:
  • WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
  • Year:
  • 2006

Abstract

Information extraction technologies meet the market need for automatic tools that extract semi-structured information from web pages. However, pages may change over time for various reasons, ranging from routine restyling to deliberate modifications intended to confuse Web wrappers. In this paper we address the latter scenario by studying deliberate wrapper spoiling and its relationship to wrapping. We present an architecture and a tool implementing a wrapper spoiling system, and discuss some practical spoiling techniques, which are also tested experimentally.