Finding Outlying Items in Sets of Partial Rankings

Authors:
Antti Ukkonen;Heikki Mannila
Affiliations:
Helsinki University of Technology, and Helsinki Institute for Information Technology;Helsinki University of Technology, and University of Helsinki, and Helsinki Institute for Information Technology
Venue:
PKDD 2007: Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2007

Abstract

Partial rankings are totally ordered subsets of a set of items. For example, the sequence in which a user browses through different parts of a website is a partial ranking. We consider the following problem: given a set D of partial rankings, find items that have strongly different status in different parts of D. To do this, we first compute a clustering of D and then look for items whose average rank in a cluster deviates substantially from their average rank in D. Such items can be seen as those that contribute the most to the differences between the clusters. To test the statistical significance of the items found, we propose a method based on an MCMC algorithm for sampling random sets of partial rankings with exactly the same statistics as D. We also demonstrate the method on movie rankings and gene expression data.
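
The comparison of cluster-level and global average ranks described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the rank of an item is taken to be its position scaled to [0, 1] within each partial ranking, the cluster labels are supplied by hand instead of being computed, and the function names `average_ranks` and `rank_deviations` are hypothetical. The MCMC-based significance test is not covered here.

```python
from collections import defaultdict


def average_ranks(rankings):
    """Mean normalized position of each item over the rankings that contain it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            # Assumption: rank is the position scaled to [0, 1]; a
            # single-item ranking is given the midpoint 0.5.
            totals[item] += pos / (n - 1) if n > 1 else 0.5
            counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def rank_deviations(rankings, labels):
    """For each cluster, map item -> (cluster average rank - overall average rank)."""
    overall = average_ranks(rankings)
    clusters = defaultdict(list)
    for ranking, label in zip(rankings, labels):
        clusters[label].append(ranking)
    return {
        label: {
            item: avg - overall[item]
            for item, avg in average_ranks(members).items()
        }
        for label, members in clusters.items()
    }


# Toy data: two groups that disagree about where item "a" belongs.
D = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "c", "a"],
    ["c", "b", "a"],
]
labels = [0, 0, 1, 1]  # hypothetical cluster assignment, supplied by hand

deviations = rank_deviations(D, labels)
for label, devs in sorted(deviations.items()):
    # Report the item whose cluster average differs most from its average in D.
    item, dev = max(devs.items(), key=lambda kv: abs(kv[1]))
    print(f"cluster {label}: largest deviation is item {item!r} ({dev:+.2f})")
```

In this toy data every item has the same average rank over all of D, but item "a" sits at the top of the rankings in one cluster and at the bottom in the other, so it shows the largest deviation in both clusters. Whether a deviation of that size is meaningful is the question the randomization test in the abstract is meant to answer.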