Web Data Cleansing and Preparation for Ontology Extraction Using WordNet

  • Authors:
  • Keng-Woei Tan; Hyoil Han; Ramez Elmasri

  • Venue:
  • WISE '00: Proceedings of the First International Conference on Web Information Systems Engineering (WISE'00), Volume 2
  • Year:
  • 2000

Abstract

The explosive growth of data on the web makes information management and knowledge discovery increasingly difficult. Applying database techniques to manage web information can help solve these problems. One difficulty is that web documents, unlike structured databases, contain unstructured and semi-structured data. Our hypothesis is that creating ontologies to describe the semantics of web data is the key to bridging the gap between semi-structured data and structured databases, and hence to facilitating the application of database techniques. We automatically extract an ontology (or conceptual schema) from a set of web pages in a particular application domain. The prototype we are constructing is called WebOntEx, for Web Ontology Extraction. This paper describes the data preparation process and the semantic resolution process that the WebOntEx project uses to build a meta-database and a web database.