Ontology extraction and conceptual modeling for web information

  • Authors:
  • Hyoil Han; Ramez Elmasri

  • Affiliations:
  • The University of Texas at Arlington; The University of Texas at Arlington

  • Venue:
  • Information modeling for internet applications
  • Year:
  • 2003

Abstract

Much work has been done on extracting data content from the Web, but less attention has been given to extracting the conceptual schemas or ontologies that underlie Web pages. The goal of the WebOntEx (Web ontology extraction) project is to make progress toward semiautomatically extracting Web ontologies by analyzing a set of Web pages that are in the same application domain. The ontology is considered a complete schema of the domain concepts. Our ontology metaconcepts are based on the extended entity-relationship (EER) model. The concepts are classified into entity types, relationships, attributes, and superclass/subclass hierarchies. WebOntEx attempts to extract ontology concepts by analyzing the use of HTML tags and by utilizing part-of-speech tagging. WebOntEx applies heuristic rules and machine learning techniques, in particular inductive logic programming (ILP).
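
To make the idea of tag-based concept extraction concrete, the following is a minimal sketch in Python and not the WebOntEx implementation. It assumes one hypothetical heuristic that the abstract does not state: heading text (`h1` to `h3`) is treated as a candidate entity type, and table header cells (`th`) that follow it are treated as candidate attributes of that entity type. The names `EntityType`, `CandidateCollector`, and `extract_candidates` are illustrative only. Part-of-speech tagging, relationships, superclass/subclass hierarchies, and ILP-based rule learning are left out.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class EntityType:
    """A candidate EER entity type with its candidate attributes."""
    name: str
    attributes: List[str] = field(default_factory=list)


class CandidateCollector(HTMLParser):
    """Hypothetical heuristic (an assumption, not taken from the paper):
    heading text suggests an entity type, and table header cells that
    follow it suggest attributes of that entity type."""

    HEADING_TAGS = {"h1", "h2", "h3"}

    def __init__(self) -> None:
        super().__init__()
        self.entities: List[EntityType] = []
        self._current_tag: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Track only the tags the heuristic cares about.
        if tag in self.HEADING_TAGS or tag == "th":
            self._current_tag = tag

    def handle_endtag(self, tag):
        if tag == self._current_tag:
            self._current_tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._current_tag is None:
            return
        if self._current_tag in self.HEADING_TAGS:
            # A heading starts a new candidate entity type.
            self.entities.append(EntityType(name=text))
        elif self._current_tag == "th" and self.entities:
            # A header cell is attached to the most recent entity type.
            self.entities[-1].attributes.append(text)


def extract_candidates(html: str) -> List[EntityType]:
    """Return candidate entity types found in one HTML page."""
    collector = CandidateCollector()
    collector.feed(html)
    collector.close()
    return collector.entities


if __name__ == "__main__":
    page = (
        "<h2>Course</h2>"
        "<table><tr><th>Title</th><th>Instructor</th></tr>"
        "<tr><td>Databases</td><td>Smith</td></tr></table>"
    )
    for entity in extract_candidates(page):
        print(entity.name, entity.attributes)
```

In a semiautomatic setting such as the one the abstract describes, candidates like these would still need review, and a fuller pipeline would combine evidence from several pages in the same domain before accepting a concept.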