Ontology-Based Automatic Classification and Ranking for Web Documents

  • Authors:
  • Jun Fang; Lei Guo; XiaoDong Wang; Ning Yang

  • Affiliations:
  • Northwestern Polytechnical University, China; Northwestern Polytechnical University, China; Northwestern Polytechnical University, China; Northwestern Polytechnical University, China

  • Venue:
  • FSKD '07 Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 03
  • Year:
  • 2007

Abstract

Web document classification computes similarities between documents and categories from the information extracted from them. In recent years, ontology-based classification of web documents has been introduced to address two weaknesses of traditional machine learning algorithms: the need to train a classifier, and the failure to consider semantic relations between words. However, previous work on ontology-based web document classification overlooks two important issues: automatic ontology construction and the ranking of classified documents. To address these problems, this paper proposes an ontology-based method for classifying and ranking web documents. First, weighted term sets are extracted from web documents, and an ontology is built with an effective construction method that clarifies and augments an existing ontology. Then, the similarity score between each document and the ontology is computed with the Earth Mover's Distance (EMD), based on WordNet. Finally, web documents are assigned to categories according to their similarity scores, and a simple ranking method sorts the documents within each category. Experimental results show that the classification algorithm achieves better precision and recall than the adaptive KNN method and is competitive with the SVM method; the ranking method also performs well.
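
The similarity, assignment, and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the ground distance is a hypothetical hand-written table (TOY_DISTANCE) standing in for a WordNet-based measure, term weights are assumed to be non-negative integers, each category is a plain weighted term set rather than a constructed ontology, and ranking by ascending EMD is one possible reading of the "simple ranking method". EMD is computed here as a min-cost flow by successive shortest paths, which is one of several ways to solve the underlying transportation problem.

```python
from collections import deque

# Hypothetical ground distances standing in for a WordNet-based measure.
TOY_DISTANCE = {
    frozenset(("car", "automobile")): 0.1,
    frozenset(("football", "soccer")): 0.2,
}

def toy_distance(a, b):
    """Return 0 for identical terms, a table value for known pairs, else 1."""
    if a == b:
        return 0.0
    return TOY_DISTANCE.get(frozenset((a, b)), 1.0)

def emd(doc_weights, cat_weights, dist):
    """Earth Mover's Distance between two non-empty {term: int weight} sets."""
    if not doc_weights or not cat_weights:
        raise ValueError("both term sets must be non-empty")
    doc_terms, cat_terms = list(doc_weights), list(cat_weights)
    n, m = len(doc_terms), len(cat_terms)
    source, sink = n + m, n + m + 1
    # Residual graph; each edge is [target, capacity, cost, reverse_index].
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i, t in enumerate(doc_terms):
        add_edge(source, i, doc_weights[t], 0.0)
        for j, c in enumerate(cat_terms):
            cap = min(doc_weights[t], cat_weights[c])
            add_edge(i, n + j, cap, dist(t, c))
    for j, c in enumerate(cat_terms):
        add_edge(n + j, sink, cat_weights[c], 0.0)

    # Move as much mass as the lighter set allows (partial matching).
    total = min(sum(doc_weights.values()), sum(cat_weights.values()))
    flow, cost = 0, 0.0
    while flow < total:
        # Queue-based Bellman-Ford: cheapest augmenting path in the residual graph.
        best = [float("inf")] * len(graph)
        prev = [None] * len(graph)
        best[source] = 0.0
        queue, queued = deque([source]), {source}
        while queue:
            u = queue.popleft()
            queued.discard(u)
            for k, (v, cap, c, _) in enumerate(graph[u]):
                if cap > 0 and best[u] + c < best[v] - 1e-12:
                    best[v], prev[v] = best[u] + c, (u, k)
                    if v not in queued:
                        queue.append(v)
                        queued.add(v)
        # Find the bottleneck along the path, then push flow through it.
        push, v = total - flow, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        flow += push
        cost += push * best[sink]
    return cost / total

def classify_and_rank(documents, categories, dist):
    """Assign each document to its closest category, then sort by distance."""
    ranked = {name: [] for name in categories}
    for doc_id, terms in documents.items():
        scores = {name: emd(terms, cat, dist) for name, cat in categories.items()}
        closest = min(scores, key=scores.get)
        ranked[closest].append((doc_id, scores[closest]))
    for docs in ranked.values():
        docs.sort(key=lambda pair: pair[1])  # smaller EMD = more similar
    return ranked

# Made-up example data: categories and documents as weighted term sets.
categories = {
    "vehicles": {"automobile": 3, "engine": 1},
    "sports": {"soccer": 2, "team": 2},
}
documents = {
    "d1": {"car": 2, "engine": 2},
    "d2": {"football": 3, "team": 1},
    "d3": {"soccer": 2, "team": 2},
}
ranking = classify_and_rank(documents, categories, toy_distance)
```

In this sketch a smaller EMD means a closer match, so each document goes to the category with the lowest distance and the documents within a category are listed from most to least similar. Replacing toy_distance with a distance derived from a WordNet similarity measure would bring the sketch closer to the setting the abstract describes.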