A clustering method for web data with multi-type interrelated components

Authors:
Levent Bolelli;Seyda Ertekin;Ding Zhou;C. Lee Giles
Affiliations:
Pennsylvania State University;Pennsylvania State University;Pennsylvania State University;Pennsylvania State University
Venue:
Proceedings of the 16th international conference on World Wide Web
Year:
2007

Abstract

Traditional clustering algorithms work on "flat" data, assuming that data instances can be represented only by a set of homogeneous, uniform features. Much real-world data, however, is heterogeneous in nature, comprising multiple types of interrelated components. We present a clustering algorithm, K-SVMeans, that integrates the well-known K-Means clustering algorithm with the highly popular Support Vector Machines (SVM) in order to exploit the richness of such data. Our experimental results on authorship analysis of scientific publications show that K-SVMeans achieves better clustering performance than homogeneous data clustering.