IDEAL'07 Proceedings of the 8th international conference on Intelligent data engineering and automated learning
Web pages are more than text: they contain much contextual and structural information, e.g., the title, the meta data, and the anchor text, each of which can be seen as a data source or a representation. Because these heterogeneous data sources differ in dimensionality and in representational form, simply putting them together would not greatly enhance classification performance. We observe that, via a kernel function, data sources of different dimensions and types can be represented in a common format, the kernel matrix, which can be seen as a generalized similarity measure between a pair of web pages. Accordingly, a kernel learning approach is employed to fuse these heterogeneous data sources. Experimental results on a collection from the ODP database validate the advantages of the proposed method over traditional methods based on any single data source and over the uniformly weighted combination of them.
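The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not say how the kernel weights are learned, so the sketch swaps in kernel-target alignment as a simple weighting heuristic. The feature matrices, their dimensions, the function names, and the toy labels are all hypothetical and serve only to show how sources of different dimensionality end up as same-sized kernel matrices that can be combined.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of one data source (rows are web pages)."""
    return X @ X.T

def normalize_kernel(K):
    """Rescale K so that every page has unit self-similarity."""
    d = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    return K / np.outer(d, d)

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def fuse_kernels(kernels, y=None):
    """Return a convex combination of kernels and the weights used.

    With y=None the weights are uniform (the baseline in the abstract).
    Otherwise weights are proportional to the non-negative alignment,
    which is an assumed stand-in for the kernel learning step.
    """
    if y is None:
        w = np.ones(len(kernels))
    else:
        w = np.array([max(alignment(K, y), 0.0) for K in kernels])
        if w.sum() == 0.0:
            w = np.ones(len(kernels))
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels)), w

# Toy data: 20 pages, three sources with different dimensionalities.
rng = np.random.default_rng(0)
n = 20
y = np.where(np.arange(n) < n // 2, 1.0, -1.0)            # hypothetical labels
X_title = y[:, None] + 0.1 * rng.standard_normal((n, 5))  # informative source
X_meta = rng.standard_normal((n, 3))                      # pure noise
X_anchor = rng.standard_normal((n, 8))                    # pure noise

kernels = [normalize_kernel(linear_kernel(X))
           for X in (X_title, X_meta, X_anchor)]

K_uniform, w_uniform = fuse_kernels(kernels)
K_fused, weights = fuse_kernels(kernels, y)
print("uniform weights:", w_uniform)
print("alignment-based weights:", weights)
```

Each source yields an n-by-n matrix regardless of its feature dimension, which is what makes the weighted sum well defined. The fused matrix could then be passed to any classifier that accepts a precomputed kernel.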