Ontology creation: extraction of domain knowledge from web documents

  • Authors:
  • Veda C. Storey; Roger Chiang; G. Lily Chen

  • Affiliations:
  • Department of Computer Information Systems, J. Mack Robinson College of Business, Georgia State University, Atlanta, GA; Information Systems Department, College of Business, University of Cincinnati, Cincinnati, Ohio; Department of Computer Information Systems, J. Mack Robinson College of Business, Georgia State University, Atlanta, GA

  • Venue:
  • ER'05 Proceedings of the 24th international conference on Conceptual Modeling
  • Year:
  • 2005

Abstract

Considerable research has gone into developing ontologies and applying them to a variety of applications. The extraction of domain knowledge for developing these ontologies is often performed manually. The World Wide Web contains a wealth of knowledge about an application domain; however, it is embedded within web pages. This research presents a methodology for semi-automatically extracting knowledge from the World Wide Web and organizing it into domain ontologies. Initial semantics of a target domain are provided by a set of keywords. From these keywords, search engines are used to identify web pages that contain information relevant to the subject domain. Web data extraction techniques are then employed to extract information from these web pages and infer how the information is related. The extracted knowledge is organized into a domain ontology. Testing of the methodology on various application domains illustrates the feasibility of the approach.
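
The pipeline outlined in the abstract (keywords, page selection, extraction, relationship inference, ontology) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an in-memory list of page texts in place of search engine results, a hypothetical fixed vocabulary of candidate terms, and sentence-level co-occurrence as a simple stand-in for the relationship inference step. The function names `relevant_pages`, `extract_terms`, and `build_ontology` are illustrative only.

```python
import re
from collections import defaultdict
from itertools import combinations


def relevant_pages(pages, keywords):
    """Keep pages that mention at least one keyword.

    Assumption: this keyword match stands in for a search engine query.
    """
    wanted = {k.lower() for k in keywords}
    return [p for p in pages if wanted & set(re.findall(r"[a-z]+", p.lower()))]


def extract_terms(sentence, vocabulary):
    """Return the vocabulary terms that appear in a sentence, sorted."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return sorted(words & vocabulary)


def build_ontology(pages, keywords, vocabulary):
    """Relate terms that co-occur in one sentence of a relevant page.

    Assumption: co-occurrence is a placeholder for the paper's
    relationship inference, which the abstract does not specify.
    """
    vocab = {v.lower() for v in vocabulary}
    related = defaultdict(set)
    for page in relevant_pages(pages, keywords):
        for sentence in re.split(r"[.!?]", page):
            for a, b in combinations(extract_terms(sentence, vocab), 2):
                related[a].add(b)
                related[b].add(a)
    return {term: sorted(others) for term, others in related.items()}


# Hypothetical sample data: two "web pages", only the first is on topic.
pages = [
    "A customer places an order. Each order lists a product.",
    "The weather was sunny today.",
]
ontology = build_ontology(pages, ["order"], ["customer", "order", "product"])
```

Here `ontology` maps each term to the terms it was found alongside, so "order" is linked to both "customer" and "product" while the off-topic page contributes nothing. A fuller treatment would replace the co-occurrence step with typed relationships and keep a manual review stage, in line with the semi-automatic approach the abstract describes.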