Ontology-based HTML to XML conversion

  • Authors:
  • Shijun Li; Weijie Ou; Junqing Yu

  • Affiliations:
  • School of Computer, Wuhan University, Wuhan, China; School of Computer, Wuhan University, Wuhan, China; College of Computer Science & Technology, Huazhong University of Science & Technology, Wuhan, China

  • Venue:
  • WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
  • Year:
  • 2005

Abstract

Current wrapper approaches break down when extracting data from Web pages that are differently structured and change frequently. To tackle this challenge, this paper defines a domain-specific ontology, automatically captures the semantic hierarchy in Web pages by exploiting both structural information and common formatting information, and recognizes and extracts data through ontology-based semantic matching, without relying on page-specific formatting. The approach therefore adapts to differently structured and frequently changing Web pages within a domain of interest.
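
As a rough illustration of what matching against an ontology rather than against page-specific formatting might look like, the following minimal Python sketch converts two differently formatted HTML fragments into the same XML. It is not the authors' algorithm: the toy ontology, the label sets, the sample pages, and the names `ONTOLOGY`, `TextChunks`, `match_concept`, and `html_to_xml` are all assumptions made for this example. It also skips the semantic-hierarchy step the abstract mentions and simply treats the text that follows a recognized label as that label's value.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical toy ontology (an assumption for illustration):
# concept name -> labels that may denote that concept on a page.
ONTOLOGY = {
    "title": {"title", "name"},
    "author": {"author", "authors", "written by"},
    "year": {"year", "published", "publication year"},
}


class TextChunks(HTMLParser):
    """Collect non-empty text nodes in document order, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def match_concept(label):
    """Return the ontology concept a label denotes, or None if unknown."""
    key = label.strip().rstrip(":").strip().lower()
    for concept, labels in ONTOLOGY.items():
        if key in labels:
            return concept
    return None


def html_to_xml(html, root_tag="record"):
    """Emit one XML element per recognized label, using the next text as value."""
    parser = TextChunks()
    parser.feed(html)
    parser.close()
    chunks = parser.chunks

    root = ET.Element(root_tag)
    i = 0
    while i < len(chunks) - 1:
        concept = match_concept(chunks[i])
        if concept is not None:
            ET.SubElement(root, concept).text = chunks[i + 1]
            i += 2  # skip the value that was just consumed
        else:
            i += 1
    return ET.tostring(root, encoding="unicode")


# Two hypothetical pages with different markup and different label wording.
table_page = (
    "<table><tr><td>Title</td><td>Web Data</td></tr>"
    "<tr><td>Published</td><td>2005</td></tr></table>"
)
bold_page = "<p><b>Name:</b> Web Data</p><p><b>Year:</b> 2005</p>"

print(html_to_xml(table_page))
print(html_to_xml(bold_page))
```

Because the sketch keys on label meaning instead of tag layout, both sample pages produce the same `record` element. A realistic system would additionally need the hierarchy detection and richer semantic matching that the paper describes.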