This paper evaluates the performance of different criterion functions in the context of partitional clustering algorithms for document datasets. Our study involves seven criterion functions, three of which are introduced in this paper and four of which have been proposed in the past. We present a comprehensive experimental evaluation involving 15 different datasets, as well as an analysis of the characteristics of the various criterion functions and their effect on the clusters they produce. Our experimental results show that a set of criterion functions consistently outperforms the rest, and that some of the newly proposed criterion functions lead to the best overall results. Our theoretical analysis shows that the relative performance of the criterion functions depends on (i) the degree to which they can operate correctly when the clusters differ in tightness, and (ii) the degree to which they lead to reasonably balanced clusters.
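To make the notion of a criterion function concrete, the following is a minimal sketch of two internal criteria of the kind used in partitional document clustering. It assumes unit-length document vectors and cosine similarity. The function names and the toy data are illustrative assumptions, and the two criteria shown are not claimed to be among the seven functions evaluated in the paper.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def composite(cluster):
    """Component-wise sum of all document vectors in a cluster."""
    return [sum(column) for column in zip(*cluster)]


def squared_norm(vec):
    return sum(x * x for x in vec)


def criterion_sum_of_norms(clusters):
    """Sum over clusters of the norm of the composite vector.

    For unit-length documents this equals the total cosine similarity
    between each document and the centroid of its cluster.
    """
    return sum(math.sqrt(squared_norm(composite(c))) for c in clusters)


def criterion_avg_pairwise(clusters):
    """Sum over clusters of ||composite||^2 / cluster size.

    For unit-length documents this equals the cluster size times the
    average pairwise cosine similarity (self-pairs included).
    """
    return sum(squared_norm(composite(c)) / len(c) for c in clusters)


# Hypothetical toy data: two near-duplicate pairs pointing in different directions.
docs = [normalize(v) for v in ([1, 0, 0], [1, 0.1, 0], [0, 1, 0], [0, 1, 0.1])]

tight_solution = [docs[:2], docs[2:]]                    # similar documents together
mixed_solution = [[docs[0], docs[2]], [docs[1], docs[3]]]  # similar documents split apart

for name, solution in (("tight", tight_solution), ("mixed", mixed_solution)):
    print(name, criterion_sum_of_norms(solution), criterion_avg_pairwise(solution))
```

Both criteria are maximized, so a partitional algorithm driven by either one would prefer the tight solution over the mixed one. How such functions behave when clusters differ in tightness or size is the kind of question the analysis described above addresses.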