Many evaluation metrics are available for comparing the quality of text clustering algorithms. In this article, we define a few intuitive formal constraints on such metrics that shed light on which aspects of clustering quality each metric family captures. These constraints are validated in an experiment involving human assessments and compared with other constraints proposed in the literature. Our analysis of a wide range of metrics shows that only BCubed satisfies all of the formal constraints. We also extend the analysis to overlapping clustering, where an item can belong to more than one cluster simultaneously. Because BCubed cannot be applied directly to this task, we propose a modified version of BCubed that avoids the problems found with other metrics.
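The abstract names BCubed without defining it. As a rough illustration, the sketch below follows the commonly used definition for the non-overlapping case: an item's precision is the fraction of items in its cluster that share its gold category, its recall is the fraction of items in its category that share its cluster, and both are averaged over all items. The function name `bcubed`, the dictionary-based input format, and the toy labels are assumptions made for this example. The sketch does not cover the modified version for overlapping clustering proposed in the article.

```python
from collections import Counter


def bcubed(clusters, categories):
    """Return (precision, recall, f1) for a non-overlapping clustering.

    clusters:   dict mapping each item to its system cluster label
    categories: dict mapping each item to its gold category label
    (Both input formats are assumptions made for this sketch.)
    """
    items = list(clusters)
    if not items or set(items) != set(categories):
        raise ValueError("both mappings must cover the same non-empty item set")

    cluster_sizes = Counter(clusters[i] for i in items)
    category_sizes = Counter(categories[i] for i in items)
    # Number of items that share both a cluster and a category.
    joint = Counter((clusters[i], categories[i]) for i in items)

    # Each item counts itself, so every per-item score is strictly positive.
    precision = sum(
        joint[(clusters[i], categories[i])] / cluster_sizes[clusters[i]]
        for i in items
    ) / len(items)
    recall = sum(
        joint[(clusters[i], categories[i])] / category_sizes[categories[i]]
        for i in items
    ) / len(items)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data: cluster 1 mixes two categories, cluster 2 is pure.
    system = {"a": 1, "b": 1, "c": 1, "d": 2}
    gold = {"a": "x", "b": "x", "c": "y", "d": "y"}
    p, r, f = bcubed(system, gold)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In the toy data, merging item `c` into the cluster of `a` and `b` lowers precision, while splitting category `y` across two clusters lowers recall, so the two scores separate the two kinds of error.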