Fast and effective text mining using linear-time document clustering
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
An experimental comparison of model-based clustering methods
Machine Learning
An Adaptive Meta-Clustering Approach: Combining the Information from Different Clustering Results
CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Performance criteria for graph clustering and Markov cluster experiments
Information-theoretic co-clustering
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Entity-based cross-document coreferencing using the Vector Space Model
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Combining distributional and morphological information for part of speech induction
EACL '03 Proceedings of the Tenth Conference of the European Chapter of the Association for Computational Linguistics - Volume 1
Building a large-scale annotated Chinese corpus
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
The unsupervised learning of natural language structure
Unsupervised and semi-supervised learning of tone and pitch accent
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics
Comparing clusterings: an information based distance
Journal of Multivariate Analysis
Model-based document clustering with a collapsed Gibbs sampler
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Characterization and evaluation of similarity measures for pairs of clusterings
Knowledge and Information Systems
Evaluating unsupervised part-of-speech tagging for grammar induction
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Unsupervised induction of labeled parse trees by clustering with syntactic features
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Improved unsupervised POS induction through prototype discovery
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
SVD and clustering for unsupervised POS tagging
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Improved unsupervised POS induction using intrinsic clustering quality and a Zipfian constraint
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Type level clustering evaluation: new measures and a POS induction case study
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Controlling complexity in part-of-speech induction
Journal of Artificial Intelligence Research
Evaluating unsupervised learning for natural language processing tasks
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Scalable multi-stage clustering of tagged micro-messages
Proceedings of the 21st international conference companion on World Wide Web
Clustering short text and its evaluation
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
Entity clustering across languages
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
MaxMax: a graph-based soft clustering algorithm applied to word sense induction
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Clustering is crucial for many NLP tasks and applications. However, evaluating the output of a clustering algorithm is hard. In this paper we focus on the evaluation setting in which a gold-standard solution is available. We discuss two existing information-theoretic measures, V and VI, and show that both are hard to use when comparing the performance of different algorithms on different datasets. The V measure favors solutions with a large number of clusters, while the range of scores produced by VI depends on the size of the dataset. We present a new measure, NVI, which normalizes VI to address the latter problem. We demonstrate the superiority of NVI in a large experiment on an important NLP application, grammar induction, using real corpus data in English, German, and Chinese.
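To make the abstract's quantities concrete, here is a minimal sketch of VI and a normalized variant. VI(C, K) = H(C|K) + H(K|C) = H(C) + H(K) - 2 I(C; K), which is why its range grows with the entropy (and hence the size) of the dataset. The exact normalizer used by NVI is not stated in the abstract; the sketch below assumes normalization by the gold-class entropy H(C), which removes the dataset-size dependence.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(gold, pred):
    """I(C; K) between gold classes and induced clusters."""
    n = len(gold)
    joint = Counter(zip(gold, pred))          # joint counts n_{ck}
    pc, pk = Counter(gold), Counter(pred)     # marginal counts
    return sum((n_ck / n) * math.log(n * n_ck / (pc[c] * pk[k]))
               for (c, k), n_ck in joint.items())

def vi(gold, pred):
    """Variation of Information: H(C|K) + H(K|C) = H(C) + H(K) - 2 I(C;K)."""
    return entropy(gold) + entropy(pred) - 2 * mutual_information(gold, pred)

def nvi(gold, pred):
    """Normalized VI. Dividing by the gold entropy H(C) is an assumption
    about the normalizer; it makes scores comparable across dataset sizes."""
    hc = entropy(gold)
    return vi(gold, pred) / hc if hc > 0 else entropy(pred)
```

A perfect clustering gives VI = 0 (and NVI = 0), while an uninformative clustering that ignores the gold classes pushes VI toward H(C) + H(K), a ceiling that depends on the dataset — the dependence the normalization is meant to remove.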