Enhanced hypertext categorization using hyperlinks
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
A Study of Approaches to Hypertext Categorization
Journal of Intelligent Information Systems
Data Mining and Knowledge Discovery Handbook
Data Mining and Knowledge Discovery Handbook
Fast webpage classification using URL features
Proceedings of the 14th ACM international conference on Information and knowledge management
Graph-based text classification: learn from your neighbors
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Rough set Based Ensemble Classifier for Web Page Classification
Fundamenta Informaticae
Web page classification with heterogeneous data fusion
Proceedings of the 16th international conference on World Wide Web
Tensor Space Model for Hypertext Representation
ICIT '08 Proceedings of the 2008 International Conference on Information Technology
Link-Local features for hypertext classification
EWMF'05/KDO'05 Proceedings of the 2005 joint international conference on Semantics, Web and Mining
As the WWW grows at an increasing speed, classifiers targeted at hypertext are in high demand. While document categorization is quite mature, the use of hypertext structure and hyperlinks has been relatively unexplored. In this paper, we introduce a tensor space model for representing hypertext documents. We exploit the local structure and the neighborhood recommendation encapsulated in the proposed representation model. Instead of using only the text on a page to represent features in a vector space model, we use features on the page together with neighborhood features to represent a hypertext document in a tensor space model. We define a tensor similarity measure and demonstrate the use of a rough set based ensemble classifier on the proposed tensor space model. In our experiments, classification with our method outperforms existing hypertext classification techniques.
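To make the representation concrete, the following is a minimal sketch of one way a hypertext document could be held as a stack of per-component term vectors rather than a single vector. It is not the paper's definition: the component names (`page_text`, `title`, `anchor_text`, `neighbor_text`), the helper names, and the similarity itself (the mean of per-component cosine similarities) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
import math

# Hypothetical component names; the abstract does not list the actual ones.
COMPONENTS = ("page_text", "title", "anchor_text", "neighbor_text")


def to_tensor(doc):
    """Map a document (dict of component -> string) to one
    term-count vector per component."""
    return {c: Counter(doc.get(c, "").lower().split()) for c in COMPONENTS}


def cosine(u, v):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty component contributes no similarity
    return dot / (norm_u * norm_v)


def tensor_similarity(a, b):
    """Assumed measure: mean of the per-component cosine similarities."""
    return sum(cosine(a[c], b[c]) for c in COMPONENTS) / len(COMPONENTS)


page_a = to_tensor({
    "page_text": "tensor space model for hypertext",
    "title": "tensor model",
    "anchor_text": "hypertext paper",
    "neighbor_text": "web page classification",
})
page_b = to_tensor({
    "page_text": "cooking recipes and kitchen tips",
    "title": "recipes",
    "anchor_text": "food blog",
    "neighbor_text": "restaurant reviews",
})

print(tensor_similarity(page_a, page_a))
print(tensor_similarity(page_a, page_b))
```

Keeping the components separate means that a match in anchor text is never conflated with a match in body text, which is the intuition behind moving from a vector to a tensor representation. A classifier could then be trained per component and the outputs combined, in the spirit of the ensemble approach the abstract mentions.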