Data Extraction and Annotation for Dynamic Web Pages

  • Authors:
  • Hui Song;Suraj Giri;Fanyuan Ma

  • Venue:
  • EEE '04 Proceedings of the 2004 IEEE International Conference on e-Technology, e-Commerce and e-Service (EEE'04)
  • Year:
  • 2004

Abstract

Many Web sites contain large sets of pages generated dynamically using a common template. The structured data extracted from these pages, together with semantic annotation, are valuable for information systems. In this paper, we propose a system, ADeaD, to automatically extract data values from these Web pages and annotate the data schema. Experimental evaluation on many real Web page collections indicates that our algorithm correctly extracts the data and annotates the data schema.