Web Data Cleansing and Preparation for Ontology Extraction Using WordNet

Authors:
Keng-Woei Tan; Hyoil Han; Ramez Elmasri
Venue:
WISE '00 Proceedings of the First International Conference on Web Information Systems Engineering (WISE'00) - Volume 2
Year:
2000

References (4)

Extracting schema from semistructured data
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Database techniques for the World-Wide Web: a survey
ACM SIGMOD Record

What Are Ontologies, and Why Do We Need Them?
IEEE Intelligent Systems

Querying Semi-Structured Data
ICDT '97 Proceedings of the 6th International Conference on Database Theory

Cited by (3)

Using recursive ART network to construction domain ontology based on term frequency and inverse document frequency
Expert Systems with Applications: An International Journal

Enhancement of domain ontology construction using a crystallizing approach
Expert Systems with Applications: An International Journal

Repurposing social tagging data for extraction of domain-level concepts
NLDB'11 Proceedings of the 16th international conference on Natural language processing and information systems

Abstract

The explosive growth of data on the web makes information management and knowledge discovery increasingly difficult. Applying database techniques to manage web information can help solve these problems. One difficulty is that web documents, unlike structured databases, contain unstructured and semi-structured data. Our hypothesis is that creating ontologies to describe the semantics of web data is the key to bridging the gap between semi-structured data and structured databases, and hence to facilitating the application of database techniques. We automatically extract an ontology (or conceptual schema) from a set of web pages in a particular application domain. The prototype we are constructing is called WebOntEx, for Web Ontology Extraction. This paper describes the data preparation and semantic resolution processes of the WebOntEx project, which are used to build a meta-database and a web database.