Information Extraction from Web Pages

  • Authors:
  • Róbert Novotny; Peter Vojtas; Dušan Maruscak

  • Venue:
  • WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
  • Year:
  • 2009

Abstract

We present a chain of techniques for extracting object attribute data from web pages that contain either data about multiple objects or detailed data about a single object. We discover data regions containing multiple data records, which are then extracted with the help of an extraction ontology. Furthermore, we present an additional algorithm for detail-page extraction based on comparing two HTML subtrees.
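
The abstract does not describe how the two HTML subtrees are compared, so the following is only a minimal sketch of one common reading of the idea, not the authors' algorithm. It assumes that two detail pages generated from the same template share their tag structure, so text that differs at the same position is likely object data. The names Node, TreeBuilder, parse and diff_subtrees, as well as the two sample pages, are hypothetical and chosen for illustration only.

```python
from html.parser import HTMLParser


class Node:
    """A minimal element node: tag name, child elements, and direct text."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a Node tree. Assumption: every opened tag is explicitly closed."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        # Attach stripped text to the element that is currently open.
        self.stack[-1].text += data.strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def diff_subtrees(a, b, path=""):
    """Walk two subtrees in parallel and return (path, text_a, text_b)
    for every position where the direct text differs."""
    here = f"{path}/{a.tag}"
    if a.tag != b.tag or len(a.children) != len(b.children):
        # Structures diverge; this sketch does not try to align them.
        return [(here, "<structure differs>", "<structure differs>")]
    found = []
    if a.text != b.text:
        found.append((here, a.text, b.text))
    for child_a, child_b in zip(a.children, b.children):
        found.extend(diff_subtrees(child_a, child_b, here))
    return found


# Two hypothetical detail pages that share one template.
page_a = "<div><h1>Phone X</h1><p>Price: <b>100</b></p></div>"
page_b = "<div><h1>Phone Y</h1><p>Price: <b>250</b></p></div>"
differences = diff_subtrees(parse(page_a), parse(page_b))
for path, text_a, text_b in differences:
    print(path, repr(text_a), repr(text_b))
```

In this sketch the shared label "Price:" is treated as template text because it is identical in both pages, while the heading and the bold value are reported as candidate attribute data. Real pages would need tolerant parsing and some form of subtree alignment, which this example deliberately leaves out.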