Fast, high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. In particular, clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for interactive visualization and exploration, as they provide data views that are consistent, predictable, and available at different levels of granularity. This paper focuses on document clustering algorithms that build such hierarchical solutions. It (i) presents a comprehensive study of partitional and agglomerative algorithms that use different criterion functions and merging schemes, and (ii) presents a new class of clustering algorithms, called constrained agglomerative algorithms, that combine features of both partitional and agglomerative approaches. This combination allows them to reduce the early-stage errors made by agglomerative methods and hence improve the quality of the clustering solutions. The experimental evaluation shows that, contrary to common belief, partitional algorithms always lead to better solutions than agglomerative algorithms, which makes them ideal for clustering large document collections because of both their relatively low computational requirements and their higher clustering quality. Furthermore, the constrained agglomerative methods consistently lead to better solutions than agglomerative methods alone, and in many cases they outperform partitional methods as well.
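The abstract does not spell out how the constrained agglomerative scheme works, so the following is only a minimal sketch of one plausible reading: a partitional step first splits the collection into k groups, agglomerative merging is then restricted to documents within the same group, and the resulting group trees are finally merged into a single hierarchy. Everything concrete here is an assumption made for illustration, not the paper's method: the function names, the tiny spherical k-means used as the partitional step, the first-k seeding, and the centroid-cosine merging rule.

```python
import math
from itertools import combinations

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def cosine(a, b):
    # Dot product; equals cosine similarity because inputs are unit length.
    return sum(x * y for x, y in zip(a, b))

def centroid(vectors, members):
    dim = len(vectors[0])
    return normalize([sum(vectors[i][d] for i in members) for d in range(dim)])

def leaves(tree):
    # A tree is either a document index (int) or a (left, right) pair.
    return [tree] if isinstance(tree, int) else leaves(tree[0]) + leaves(tree[1])

def agglomerate(vectors, trees):
    # Repeatedly merge the two subtrees whose centroids are most similar.
    # (Assumed merging rule; the paper compares several schemes.)
    trees = list(trees)
    while len(trees) > 1:
        i, j = max(
            combinations(range(len(trees)), 2),
            key=lambda p: cosine(centroid(vectors, leaves(trees[p[0]])),
                                 centroid(vectors, leaves(trees[p[1]]))),
        )
        merged = (trees[i], trees[j])
        trees = [t for n, t in enumerate(trees) if n not in (i, j)] + [merged]
    return trees[0]

def partition(vectors, k, iterations=10):
    # Tiny spherical k-means; seeding with the first k documents is an
    # arbitrary choice for this sketch and requires k <= len(vectors).
    centers = [vectors[i] for i in range(k)]
    assign = [0] * len(vectors)
    for _ in range(iterations):
        assign = [max(range(k), key=lambda c: cosine(v, centers[c]))
                  for v in vectors]
        for c in range(k):
            members = [i for i, a in enumerate(assign) if a == c]
            if members:
                centers[c] = centroid(vectors, members)
    return [[i for i, a in enumerate(assign) if a == c] for c in range(k)]

def constrained_agglomerative(raw_vectors, k):
    vectors = [normalize(v) for v in raw_vectors]
    groups = [g for g in partition(vectors, k) if g]
    # Constraint: early merges may only join documents from the same group.
    subtrees = [agglomerate(vectors, g) for g in groups]
    # The group trees are then merged to complete the hierarchy.
    return agglomerate(vectors, subtrees)

# Hypothetical toy term vectors: documents 0/3, 1/4, and 2/5 share a topic.
docs = [[1, 0, 0], [0, 1, 0], [0, 0, 1],
        [0.9, 0.1, 0], [0, 0.9, 0.1], [0.1, 0, 0.9]]
tree = constrained_agglomerative(docs, k=3)
```

In this toy run each same-topic pair ends up as its own subtree before any cross-topic merge happens, which is the sense in which the partitional step can shield the hierarchy from poor early merges. The order in which the three group trees are joined at the top depends on tie-breaking and should not be read as meaningful.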