Overfitting arises frequently in text mining because feature spaces are high dimensional, which makes the task of the learning algorithm difficult. Moreover, such spaces cannot be visualized directly. We focus on supervised text classification and present an approach that combines prior information about training labels, manifold learning, and Support Vector Machines (SVMs). Manifold learning serves as a pre-processing step that performs nonlinear dimensionality reduction to counter the curse of dimensionality. We use Isomap (Isometric Mapping), which embeds text in a low-dimensional space while preserving geodesic distances along the manifold, and with them the geometric structure of the data. Kernel-based machines are then applied in this reduced space for the final text classification. Results on a real-world benchmark corpus from Reuters demonstrate the visualization capabilities of the method in the severely reduced space. Furthermore, we show that the method yields performance comparable to that of single kernel-based machines.
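
The pipeline described above (text features, Isomap embedding, kernel classifier) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's standard, unsupervised `Isomap`, so the label-informed variant mentioned in the abstract is not reproduced; the TF-IDF features, the RBF kernel, the toy corpus, and the names `fit_isomap_svm` and `predict_labels` are all hypothetical choices made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import Isomap
from sklearn.svm import SVC

# Hypothetical toy corpus standing in for a real benchmark (NOT Reuters data).
train_docs = [
    "wheat harvest rises as grain exports grow",
    "corn and wheat prices fall on grain surplus",
    "grain shipments of wheat and corn increase",
    "farmers expect record corn harvest this year",
    "central bank raises interest rates again",
    "interest rates and money supply worry the bank",
    "bank lending slows as rates climb",
    "money market rates steady after bank decision",
]
train_labels = ["grain"] * 4 + ["money"] * 4


def fit_isomap_svm(docs, labels, n_neighbors=5, n_components=2):
    """Fit TF-IDF -> Isomap -> SVM and return the fitted parts and embedding.

    Assumption: plain Isomap is used; the paper's use of label information
    during the embedding is not modelled here.
    """
    vectorizer = TfidfVectorizer()
    # Dense array keeps the example simple; fine for a tiny corpus only.
    X = vectorizer.fit_transform(docs).toarray()

    # n_neighbors is more than half the sample size here, which guarantees
    # a connected neighbourhood graph on this very small corpus.
    isomap = Isomap(n_neighbors=n_neighbors, n_components=n_components)
    Z = isomap.fit_transform(X)  # low-dimensional embedding, one row per doc

    # Kernel-based classifier trained in the reduced space.
    clf = SVC(kernel="rbf")
    clf.fit(Z, labels)
    return vectorizer, isomap, clf, Z


def predict_labels(vectorizer, isomap, clf, docs):
    """Map unseen documents into the embedding and classify them there."""
    X_new = vectorizer.transform(docs).toarray()
    Z_new = isomap.transform(X_new)
    return list(clf.predict(Z_new))


if __name__ == "__main__":
    vec, iso, clf, Z = fit_isomap_svm(train_docs, train_labels)
    print(Z.shape)  # two coordinates per document, suitable for a scatter plot
    print(predict_labels(vec, iso, clf, ["wheat and corn exports"]))
```

With `n_components=2`, the rows of the returned embedding can be plotted directly, which is the kind of visualization the abstract refers to. On a real corpus, the neighbourhood size and target dimension would need tuning, and a sparse representation would be preferable to the dense array used here.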