Ontology extraction and conceptual modeling for web information

Authors:
Hyoil Han; Ramez Elmasri
Affiliations:
The University of Texas at Arlington; The University of Texas at Arlington
Venue:
Information modeling for internet applications
Year:
2003

Abstract

Much work has been done on extracting data content from the Web, but less attention has been given to extracting the conceptual schemas or ontologies underlying Web pages. The goal of the WebOntEx (Web ontology extraction) project is to make progress toward semiautomatically extracting Web ontologies by analyzing a set of Web pages that are in the same application domain. The ontology is considered a complete schema of the domain concepts. Our ontology metaconcepts are based on the extended entity-relationship (EER) model. The concepts are classified into entity types, relationships, attributes, and superclass/subclass hierarchies. WebOntEx attempts to extract ontology concepts by analyzing the use of HTML tags and by utilizing part-of-speech tagging. WebOntEx applies heuristic rules and machine learning techniques, in particular inductive logic programming (ILP).
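
To make the idea of tag-based concept extraction concrete, here is a minimal Python sketch. It is not WebOntEx's implementation: the tag-to-metaconcept table (TAG_HINTS), the min_pages threshold, and the function and class names are all hypothetical, chosen only for illustration. The sketch collects text found inside a few HTML tags across several pages from one domain and keeps the terms that recur, labelling each with a guessed EER metaconcept. Part-of-speech tagging and ILP are left out, since both would need components the abstract does not describe.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical hints mapping HTML tags to EER metaconcepts.
# These are illustrative assumptions, not the heuristic rules of WebOntEx.
TAG_HINTS = {
    "h1": "entity type",
    "h2": "entity type",
    "th": "attribute",
    "dt": "attribute",
}


class CandidateCollector(HTMLParser):
    """Collects the text enclosed by tags listed in TAG_HINTS on one page."""

    def __init__(self):
        super().__init__()
        self._current = None     # tag of interest currently open, if any
        self._buffer = []        # text pieces seen inside that tag
        self.candidates = set()  # (term, hinted metaconcept) pairs

    def handle_starttag(self, tag, attrs):
        if tag in TAG_HINTS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Normalize whitespace and case before recording the term.
            term = " ".join("".join(self._buffer).split()).lower()
            if term:
                self.candidates.add((term, TAG_HINTS[tag]))
            self._current = None


def extract_candidates(pages, min_pages=2):
    """Return candidates that occur on at least min_pages of the given pages.

    The recurrence threshold is an assumption of this sketch: a term that
    shows up on several pages of one domain is treated as a likelier concept.
    """
    page_counts = Counter()
    for html in pages:
        collector = CandidateCollector()
        collector.feed(html)
        collector.close()
        page_counts.update(collector.candidates)  # one count per page
    return {cand: n for cand, n in page_counts.items() if n >= min_pages}


# Two made-up pages from a hypothetical "university courses" domain.
SAMPLE_PAGES = [
    "<html><body><h1>Course</h1><table><tr>"
    "<th>Title</th><th>Instructor</th></tr></table></body></html>",
    "<html><body><h1>Course</h1><table><tr>"
    "<th>Title</th><th>Credits</th></tr></table></body></html>",
]
```

Calling extract_candidates(SAMPLE_PAGES) keeps only the terms that appear on both sample pages. A fuller pipeline would pass these candidates on to the part-of-speech and rule-learning steps mentioned in the abstract.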