Extracting and modeling the semantic information content of web documents to support semantic document retrieval

Authors:
Shahrul Azman Noah;Lailatulqadri Zakaria;Arifah Che Alhadi
Affiliations:
Universiti Kebangsaan Malaysia, UKM Bangi Selangor, Malaysia;Universiti Kebangsaan Malaysia, UKM Bangi Selangor, Malaysia;Universiti Malaysia Terengganu, Terengganu, Malaysia
Venue:
APCCM '09 Proceedings of the Sixth Asia-Pacific Conference on Conceptual Modeling - Volume 96
Year:
2009

Abstract

Existing HTML markup indicates only the structure and layout of documents, not their semantics. As a result, web documents are difficult for computer applications to process, retrieve and explore semantically. Existing information extraction systems are mainly concerned with extracting important keywords or key phrases that represent the content of the documents; the semantic aspects of such keywords have not been explored extensively. In this paper we propose an approach that assists in extracting and modeling the semantic information content of web documents using natural language analysis techniques and a domain-specific ontology. With the user's participation, the tool gradually extracts and constructs a semantic document model, which is represented as XML. The semantic models representing each document are then integrated to form a global semantic model. Such a model provides users with a global knowledge model of a domain.
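The pipeline outlined in the abstract (ontology-guided extraction, a per-document model in XML, then integration into a global model) can be illustrated with a minimal sketch. This is not the authors' tool: the toy ontology, the element names (`document`, `concept`, `source`, `globalModel`) and the function names are all hypothetical, simple token matching stands in for natural language analysis, and the user-participation step is omitted.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy ontology: surface term -> concept label (not from the paper).
ONTOLOGY = {
    "ontology": "KnowledgeModel",
    "xml": "MarkupLanguage",
    "html": "MarkupLanguage",
}

def extract_concepts(text):
    """Return ontology terms found in the text, mapped to their concepts.

    Plain token lookup is an assumption standing in for real NL analysis.
    """
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {term: ONTOLOGY[term] for term in sorted(tokens) if term in ONTOLOGY}

def build_document_model(doc_id, text):
    """Build a per-document semantic model as an XML element."""
    root = ET.Element("document", id=doc_id)
    for term, concept in extract_concepts(text).items():
        ET.SubElement(root, "concept", name=concept, term=term)
    return root

def integrate(models):
    """Merge per-document models into one global model grouped by concept."""
    global_root = ET.Element("globalModel")
    by_concept = {}
    for model in models:
        for node in model.findall("concept"):
            name = node.get("name")
            if name not in by_concept:
                by_concept[name] = ET.SubElement(global_root, "concept", name=name)
            # Record which document and which term contributed this concept.
            ET.SubElement(by_concept[name], "source",
                          doc=model.get("id"), term=node.get("term"))
    return global_root

docs = {
    "d1": "HTML marks up layout, not semantics.",
    "d2": "The model is stored as XML and guided by an ontology.",
}
models = [build_document_model(doc_id, text) for doc_id, text in docs.items()]
global_model = integrate(models)
print(ET.tostring(global_model, encoding="unicode"))
```

Grouping by concept rather than by document is one possible design for the global model; it lets a retrieval step look up every document that mentions a given concept, regardless of the surface term used.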