Semantically Accessing Documents Using Conceptual Model Descriptions

  • Authors:
  • Terje Brasethvik; Jon Atle Gulla

  • Venue:
  • ER '99 Proceedings of the Workshops on Evolution and Change in Data Management, Reverse Engineering in Information Systems, and the World Wide Web and Conceptual Modeling
  • Year:
  • 1999

Abstract

When publishing documents on the Web, the user needs to describe and classify her documents for the benefit of later retrieval and use. This paper presents an approach to semantic document classification and retrieval based on Natural Language Processing and Conceptual Modeling. The Referent Model language is used in combination with a lexical analysis tool to define a controlled vocabulary for classifying documents. Documents are classified by means of sentences that contain the high-frequency words in the document that also occur in the domain model defining the vocabulary. The sentences are parsed using a DCG-like grammar, mapped into a Referent Model fragment, and stored along with the document using RDF-XML syntax. The model fragment represents the connection between the document and the domain model and serves as a document index. The approach is being implemented for a document collection published by the Norwegian Center for Medical Informatics (KITH).
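
The sentence-selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `min_count` threshold, and the sample vocabulary are all hypothetical, and plain word counting stands in for the lexical analysis tool. The later steps (DCG-like parsing, mapping into a Referent Model fragment, RDF-XML storage) are not shown.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def frequent_vocabulary_terms(text, vocabulary, min_count=2):
    """Words occurring at least min_count times that are also in the vocabulary.

    The vocabulary plays the role of the concept names in the domain model;
    min_count is an assumed stand-in for 'high frequency'.
    """
    counts = Counter(tokenize(text))
    return {w for w, c in counts.items() if c >= min_count and w in vocabulary}


def candidate_sentences(text, vocabulary, min_count=2):
    """Return (sentence, matched_terms) pairs for sentences containing such words."""
    terms = frequent_vocabulary_terms(text, vocabulary, min_count)
    result = []
    for sentence in split_sentences(text):
        hits = sorted(terms & set(tokenize(sentence)))
        if hits:
            result.append((sentence, hits))
    return result


# Hypothetical document text and controlled vocabulary, for illustration only.
document = (
    "The patient record stores each diagnosis. "
    "A diagnosis refers to one patient. "
    "Billing is handled elsewhere."
)
vocabulary = {"patient", "diagnosis", "record"}

for sentence, hits in candidate_sentences(document, vocabulary):
    print(hits, "<-", sentence)
```

In this example "record" is in the vocabulary but occurs only once, so it is not treated as a high-frequency term; the third sentence is skipped because it contains no qualifying word. A real system would likely also need stemming and stop-word handling, which this sketch omits.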