As the web grows at an increasing pace, hypertext classification is becoming a necessity. While the literature on text categorization is quite mature, the use of hypertext structure and hyperlinks remains relatively unexplored. In this paper, we introduce a novel split-and-merge technique for classifying hypertext documents. The splitting process is performed at the feature level by representing the hypertext features in a tensor space model. We exploit the local structure and neighborhood recommendation encapsulated in this representation model. The merging process is performed on the multiple classifications obtained from the split representation. A meta-level decision system is formed from the predictions of base-level classifiers, each trained on a different component of the tensor, together with the actual category of the hypertext document. These individual predictions for each component of the tensor are subsequently combined into a final prediction using rough set based ensemble classifiers. Experimental results show that classification with our method is marginally better than existing hypertext classification techniques.
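The split-and-merge idea can be illustrated with a minimal sketch. It is not the method evaluated in the paper. The component names (`text`, `anchor`, `url`), the token-count base classifier, and the toy data are all assumptions made for illustration, and a plain lookup into the meta-level decision table (with a majority-vote fallback) stands in for the rough set based ensemble classifier.

```python
from collections import Counter, defaultdict

# Hypothetical components of the split representation (an assumption,
# not the tensor components used in the paper).
COMPONENTS = ("text", "anchor", "url")


def train_base(token_lists, labels):
    """Base-level classifier for one component: token counts per category."""
    counts = defaultdict(Counter)
    for tokens, label in zip(token_lists, labels):
        counts[label].update(tokens)
    return counts


def predict_base(model, tokens):
    """Pick the category that has seen the document's tokens most often."""
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    # Iterating over sorted labels makes ties resolve deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


def base_predictions(models, doc):
    """Split step: one prediction per component of the document."""
    return tuple(predict_base(models[n], doc.get(n, [])) for n in COMPONENTS)


def fit(docs, labels):
    """Train one base classifier per component, then build the meta-level
    decision table: (base predictions) -> counts of actual categories."""
    models = {
        n: train_base([d.get(n, []) for d in docs], labels) for n in COMPONENTS
    }
    table = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        table[base_predictions(models, doc)][label] += 1
    return models, table


def predict(models, table, doc):
    """Merge step: look the prediction pattern up in the decision table.
    Unseen patterns fall back to a majority vote over base predictions."""
    row = base_predictions(models, doc)
    if row in table:
        return table[row].most_common(1)[0][0]
    return Counter(row).most_common(1)[0][0]


# Toy training data, invented for this sketch.
train_docs = [
    {"text": ["course", "homework", "exam"], "anchor": ["syllabus"], "url": ["edu", "cs101"]},
    {"text": ["lecture", "exam"], "anchor": ["course", "notes"], "url": ["edu", "cs202"]},
    {"text": ["professor", "research", "publications"], "anchor": ["homepage"], "url": ["edu", "people"]},
    {"text": ["research", "students", "professor"], "anchor": ["prof"], "url": ["people", "home"]},
]
train_labels = ["course", "course", "faculty", "faculty"]

models, table = fit(train_docs, train_labels)
seen = predict(models, table, {"text": ["exam", "homework"], "anchor": ["notes"], "url": ["cs101"]})
unseen = predict(models, table, {"text": ["professor"], "anchor": ["prof"], "url": ["cs101"]})
```

In this toy run the first query produces a prediction pattern that already appears in the decision table, while the second produces a pattern on which the components disagree, so the fallback vote decides. A rough set based combiner would instead derive decision rules from the same table.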