Traditional clustering algorithms work on "flat" data, assuming that data instances can be represented only by a set of homogeneous, uniform features. Much real-world data, however, is heterogeneous in nature, comprising multiple types of interrelated components. We present a clustering algorithm, K-SVMeans, that integrates the well-known K-Means clustering with the highly popular Support Vector Machines (SVM) in order to exploit the richness of such data. Our experimental results on authorship analysis of scientific publications show that K-SVMeans achieves better clustering performance than homogeneous data clustering.
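
For orientation, the "flat" baseline mentioned above can be sketched as a plain K-Means over a single homogeneous feature space. This is only a minimal illustration of that baseline, not K-SVMeans itself: the abstract does not say how the SVM component is integrated, so that part is deliberately left out. The function names (`kmeans`, `squared_distance`), the toy points, and the choice to initialise centroids from the first k points are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[Point], k: int, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd-style K-Means on one homogeneous ("flat") feature space.

    Returns one cluster label per point and the final centroids.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Assumption for this sketch: deterministic initialisation from the
    # first k points (real implementations usually randomise this).
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]

    return labels, centroids


# Toy data: two well-separated groups in a single feature space.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(points, k=2)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [[0.0, 0.5], [10.0, 10.5]]
```

Every instance here is described by one feature vector of a single type, which is exactly the restriction the abstract attributes to traditional clustering; information from other, interrelated object types has no place in this formulation.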