This paper proposes an information-theoretic criterion for comparing two partitions, or clusterings, of the same data set. The criterion, called variation of information (VI), measures the amount of information lost and gained in changing from clustering C to clustering C'. The basic properties of VI are presented and discussed. We focus on two kinds of properties: (1) those that help one build intuition about the new criterion (in particular, it is shown that VI is a true metric on the space of clusterings), and (2) those that pertain to the comparability of VI values over different experimental conditions. As the latter properties have rarely been discussed explicitly before, other existing comparison criteria are also examined in their light. Finally, we present VI from an axiomatic point of view, showing that it is the only "sensible" criterion for comparing partitions that is both aligned to the lattice and convexly additive. As a consequence, we prove an impossibility result for comparing partitions: there is no criterion for comparing partitions that simultaneously satisfies the above two desirable properties and is bounded.
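As an illustration of the criterion described above, the following is a minimal sketch rather than the paper's own implementation. It assumes the standard definition VI(C, C') = H(C|C') + H(C'|C), where the first term is the information lost and the second the information gained, that each clustering is given as a list of cluster labels over the same points, and that entropies use natural logarithms. The function name `variation_of_information` is hypothetical.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Return VI between two clusterings given as per-point label lists.

    Assumed definition: VI = H(A|B) + H(B|A), in nats (natural log).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same points")
    n = len(labels_a)
    if n == 0:
        return 0.0

    size_a = Counter(labels_a)               # cluster sizes in clustering A
    size_b = Counter(labels_b)               # cluster sizes in clustering B
    joint = Counter(zip(labels_a, labels_b))  # sizes of pairwise intersections

    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # -p(a,b) * [log p(a|b) + log p(b|a)] summed over non-empty overlaps
        vi -= p_ab * (log(n_ab / size_b[b]) + log(n_ab / size_a[a]))
    return vi


if __name__ == "__main__":
    c1 = [0, 0, 1, 1]
    c2 = [0, 1, 0, 1]
    print(variation_of_information(c1, c1))  # identical clusterings
    print(variation_of_information(c1, c2))  # statistically independent clusterings
```

Because the value depends only on how the clusters overlap, renaming cluster labels leaves it unchanged, and swapping the two arguments gives the same result, consistent with VI being a metric.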