Clustering Algorithms for Chains

Authors:
Antti Ukkonen
Venue:
The Journal of Machine Learning Research
Year:
2011

Abstract

We consider the problem of clustering a set of chains into k clusters. A chain is a totally ordered subset of a finite set of items. Chains are an intuitive way to express preferences over a set of alternatives, as well as a useful representation of ratings in situations where the item-specific scores are either difficult to obtain, too noisy due to measurement error, or simply not as relevant as the order that they induce over the items. First, we adapt the classical k-means algorithm to chains by proposing a suitable distance function and a centroid structure. We also present two different approaches for mapping chains to a vector space. The first one is related to the planted partition model, while the second one has an intuitive geometrical interpretation. Finally, we discuss a randomization test for assessing the significance of a clustering. To this end, we present an MCMC algorithm for sampling random sets of chains that share certain properties with the original data. The methods are studied in a series of experiments using real and artificial data. The results indicate that the methods produce interesting clusterings and, for certain types of inputs, improve upon previous work on clustering algorithms for orders.
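
To make the idea of mapping chains to a vector space concrete, here is a minimal sketch. The abstract does not specify either of the paper's two mappings, so the encoding below is an assumption chosen for illustration only: one coordinate per unordered item pair, set to +1 or -1 according to the order in the chain, and 0 when the chain does not contain both items. The names `chain_to_vector` and `sq_dist` are hypothetical.

```python
from itertools import combinations


def chain_to_vector(chain, items):
    """Encode a chain as a vector with one coordinate per unordered item pair.

    Illustrative assumption, not the paper's definition:
    +1 if the first item of the pair precedes the second in the chain,
    -1 if it follows it, and 0 if the chain lacks either item.
    """
    pos = {item: i for i, item in enumerate(chain)}
    vec = []
    for u, v in combinations(items, 2):
        if u in pos and v in pos:
            vec.append(1 if pos[u] < pos[v] else -1)
        else:
            vec.append(0)
    return vec


def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


items = ["a", "b", "c", "d"]
v1 = chain_to_vector(["a", "b", "c"], items)  # chain a < b < c
v2 = chain_to_vector(["c", "b", "a"], items)  # the same items, reversed
v3 = chain_to_vector(["a", "b", "d"], items)  # overlaps v1 on the pair (a, b)

print(sq_dist(v1, v2), sq_dist(v1, v3))
```

Once chains are vectors of equal length, any standard vector-space clustering method, such as k-means, can be applied to them. Under this particular encoding, two chains that order their shared pairs oppositely end up far apart, while chains that merely cover different items differ less.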