Automatic Classification of Web Pages based on the Concept of Domain Ontology

Authors:
Mu-Hee Song;Soo-Yeon Lim;Dong-Jin Kang;Sang-Jo Lee
Affiliations:
Kyungpook National University, Korea;Kyungpook National University, Korea;Kyungpook National University, Korea;Kyungpook National University, Korea
Venue:
APSEC '05 Proceedings of the 12th Asia-Pacific Software Engineering Conference
Year:
2005



Abstract

The use of ontologies as a mechanism for enabling machine reasoning has increased steadily over the last few years. This paper proposes an automated method for document classification using an ontology, which expresses the terminology and vocabulary contained in Web documents as a hierarchical structure. Ontology-based document classification involves determining the document features that represent Web documents most accurately, analyzing their contents, and classifying them into the most appropriate of at least two predefined categories according to those features. In this paper, Web pages are classified in real time, not with experimental data or a learning process, but by calculating the similarity between the terminology information extracted from the Web pages and the ontology categories. This results in more accurate document classification because the meanings and relationships unique to each document are determined.
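
The similarity-based classification described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or term-extraction procedure the paper uses, so everything below is an assumption made for illustration: the toy `ONTOLOGY` dictionary (flat rather than hierarchical), the regex-based `extract_terms` helper, and the choice of cosine similarity over raw term counts are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical toy ontology: category name -> representative terms.
# A real domain ontology would be hierarchical; this flat form is an assumption.
ONTOLOGY = {
    "economy": ["market", "stock", "bank", "trade", "investment"],
    "sports": ["team", "match", "score", "league", "player"],
}


def extract_terms(text):
    """Count lowercase alphabetic tokens (a stand-in for real term extraction)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, ontology=ONTOLOGY):
    """Return (best_category, scores); best_category is None if nothing matches."""
    page_terms = extract_terms(text)
    scores = {
        category: cosine_similarity(page_terms, Counter(terms))
        for category, terms in ontology.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, scores
    return best, scores
```

Calling `classify(page_text)` scores the page against every category and returns the highest-scoring one. No training step is involved, which mirrors the abstract's point that classification relies on similarity to ontology categories rather than on a learning process.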