Text categorization for multiple users based on semantic features from a machine-readable dictionary
ACM Transactions on Information Systems (TOIS)
Machine Learning
Embedding knowledge in Web documents
WWW '99 Proceedings of the eighth international conference on World Wide Web
Ontology-focused crawling of Web documents
Proceedings of the 2003 ACM symposium on Applied computing
Web page classification without the web page
Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
An adaptive k-nearest neighbor text categorization strategy
ACM Transactions on Asian Language Information Processing (TALIP)
DCMI '02 Proceedings of the 2002 international conference on Dublin core and metadata applications: Metadata for e-communities: supporting diversity and convergence
Traditional document classification methods require feature extraction and classifier training. Collecting terms for training is laborious and time-consuming, and features are especially difficult to extract from Chinese documents. To address these problems, this paper proposes an ontology-based approach that improves the efficiency and effectiveness of web document classification and retrieval. First, the approach establishes an ontology model based on the Hownet[6] knowledge base and its method. It then creates an ontology for each subclass of the classification system, using RDFS to convert Hownet into ontologies and to define the relations among them. Web documents are then classified automatically by an ontology relevance calculation algorithm. Compared with KNN[2], our experimental results indicate that the ontology-based approach achieves accuracy close to that of KNN, is more robust, and attains better recall.
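The abstract does not spell out the relevance calculation, so the following is only a minimal sketch of the general idea of classifying by ontology relevance without a trained classifier. Everything in it is an assumption made for illustration: the `ONTOLOGIES` table, its terms and weights, the `relevance` and `classify` functions, and the scoring rule (weighted share of matching tokens) are hypothetical and are not taken from the paper or from Hownet.

```python
from collections import Counter

# Hypothetical class ontologies: concept term -> weight.
# Terms and weights are illustrative only, not derived from Hownet.
ONTOLOGIES = {
    "sports": {"match": 1.0, "team": 0.8, "score": 0.6},
    "finance": {"stock": 1.0, "bank": 0.8, "market": 0.6},
}


def relevance(tokens, ontology):
    """Weighted share of tokens that match concepts in the ontology."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    matched = sum(weight * counts[term] for term, weight in ontology.items())
    return matched / len(tokens)


def classify(tokens, ontologies=ONTOLOGIES):
    """Return (best_class, scores); best_class is None when nothing matches."""
    scores = {name: relevance(tokens, onto) for name, onto in ontologies.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


# Assumes the document has already been segmented into tokens.
doc = "the team won the match with a late score".split()
label, scores = classify(doc)
print(label, scores)
```

In this sketch no training set is needed: adding a class only requires adding an ontology entry. For Chinese text, a word segmentation step would have to produce `tokens` first.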