Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. In particular, hierarchical clustering solutions provide a view of the data at different levels of granularity, making them ideal for people to visualize and interactively explore large document collections.

In this paper we evaluate different partitional and agglomerative approaches for hierarchical clustering. Our experimental evaluation showed that partitional algorithms always lead to better clustering solutions than agglomerative algorithms, which suggests that partitional clustering algorithms are well suited for clustering large document datasets, not only because of their relatively low computational requirements but also because of their comparable or even better clustering performance. We also present a new class of clustering algorithms, called constrained agglomerative algorithms, that combine the features of both partitional and agglomerative algorithms. Our experimental results showed that they consistently lead to better hierarchical solutions than agglomerative or partitional algorithms alone.
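The abstract does not spell out how constrained agglomerative algorithms combine the two approaches. One plausible reading, sketched here under stated assumptions, is to run a partitional step first and then let agglomerative merging proceed only inside each partition before the partition-level subtrees are merged into a single hierarchy. Everything in this sketch is an illustrative assumption rather than the paper's actual procedure: the function names, the cosine/centroid merge criterion, the toy k-means-style partitioner seeded with the first k documents, and the requirement that k does not exceed the number of documents.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def partition(docs, k, iters=10):
    """Toy partitional step (assumption): k-means-style assignment by cosine
    similarity, seeded with the first k documents. Requires k <= len(docs)."""
    centers = [docs[i] for i in range(k)]
    assign = [0] * len(docs)
    for _ in range(iters):
        assign = [max(range(k), key=lambda c: cosine(d, centers[c])) for d in docs]
        for c in range(k):
            members = [d for d, a in zip(docs, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = centroid(members)
    return assign

def agglomerate(nodes, docs):
    """Repeatedly merge the two nodes whose centroids are most similar.
    Each node is (tree, member_indices); returns the single remaining node."""
    nodes = list(nodes)
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                ci = centroid([docs[m] for m in nodes[i][1]])
                cj = centroid([docs[m] for m in nodes[j][1]])
                s = cosine(ci, cj)
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        nodes = [n for idx, n in enumerate(nodes) if idx not in (i, j)] + [merged]
    return nodes[0]

def constrained_agglomerative(docs, k):
    """Hypothetical constrained scheme: merges are first restricted to each
    partition, then the per-partition subtrees are merged into one tree.
    Leaves are document indices; internal nodes are 2-tuples."""
    assign = partition(docs, k)
    subtrees = []
    for c in range(k):
        leaves = [(i, [i]) for i, a in enumerate(assign) if a == c]
        if leaves:
            subtrees.append(agglomerate(leaves, docs))
    return agglomerate(subtrees, docs)[0]

# Four toy "documents" as term-weight vectors: 0 and 2 share a topic, as do 1 and 3.
docs = [
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.9, 0.2, 0.0],
    [0.1, 0.9, 0.0],
]
print(constrained_agglomerative(docs, 2))  # ((0, 2), (1, 3))
```

In this sketch the partitional step keeps each agglomerative pass small, since pairwise merging only ever runs over one partition's documents or over the k partition-level subtrees. That is one way the low cost of partitional methods and the tree-building of agglomerative methods could be combined.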