In supervised kernel methods, the SVM classifier has been observed to perform poorly when the diagonal entries of the Gram matrix are large relative to the off-diagonal entries. This problem, referred to as diagonal dominance, often occurs when certain kernel functions are applied to sparse high-dimensional data, such as text corpora. In this paper we investigate the implications of diagonal dominance for unsupervised kernel methods, specifically in the task of document clustering. We propose a selection of strategies for addressing this issue and evaluate their effectiveness in producing more accurate and stable clusterings.
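To make the notion of diagonal dominance concrete, the following minimal sketch builds a Gram matrix for a handful of sparse toy documents and compares the mean diagonal entry with the mean off-diagonal entry. Everything here is an illustrative assumption rather than the paper's procedure: the cosine (normalized linear) kernel, the toy term counts, and the helper names `cosine_gram` and `dominance_ratio` are all hypothetical, and the ratio is only one possible way to quantify the effect.

```python
import math


def cosine_gram(docs):
    """Gram matrix of the cosine kernel over sparse term-count dicts (assumed kernel)."""
    norms = [math.sqrt(sum(v * v for v in d.values())) for d in docs]
    n = len(docs)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Only terms shared by both documents contribute to the dot product.
            dot = sum(v * docs[j].get(t, 0) for t, v in docs[i].items())
            gram[i][j] = dot / (norms[i] * norms[j])
    return gram


def dominance_ratio(gram):
    """Mean diagonal entry divided by mean off-diagonal entry (hypothetical measure)."""
    n = len(gram)
    diag = sum(gram[i][i] for i in range(n)) / n
    off = sum(gram[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return diag / off if off > 0 else math.inf


# Toy corpus: each document uses few terms, so pairs of documents share little vocabulary.
docs = [
    {"kernel": 2, "svm": 1, "margin": 1},
    {"kernel": 1, "cluster": 2, "document": 1},
    {"protein": 2, "alignment": 1, "kernel": 1},
    {"fuzzy": 1, "cluster": 1, "ensemble": 2},
]

gram = cosine_gram(docs)
ratio = dominance_ratio(gram)
print(f"diagonal / off-diagonal ratio: {ratio:.2f}")
```

With the cosine kernel every self-similarity equals 1, while documents that share few terms have similarities close to 0, so the ratio grows as the data becomes sparser. In that regime each document looks similar mainly to itself, which is the situation the abstract describes.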