Text clustering methods can be used to structure large sets of text or hypertext documents. Well-known text clustering methods, however, do not adequately address the problems specific to text clustering: the very high dimensionality of the data, the very large size of the databases, and the understandability of the cluster descriptions. In this paper, we introduce a novel approach that uses frequent item (term) sets for text clustering. Such frequent sets can be discovered efficiently using algorithms for association rule mining. To cluster based on frequent term sets, we measure the mutual overlap of frequent sets with respect to the sets of supporting documents. We present two algorithms for frequent term-based text clustering: FTC, which creates flat clusterings, and HFTC, for hierarchical clustering. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than state-of-the-art text clustering algorithms. Furthermore, our methods provide an understandable description of the discovered clusters by their frequent term sets.
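The idea of clustering by frequent term sets and their supporting documents can be illustrated with a minimal Python sketch. This is not the FTC or HFTC algorithm as published: the abstract does not give the overlap measure or the selection procedure, so the brute-force enumeration of term sets, the simple count-based overlap score, the greedy selection loop, and all names (`frequent_term_sets`, `greedy_cluster`, `min_support`, `max_size`) are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to the ids of the documents that support it.

    docs is a list of sets of terms. Brute-force enumeration is an assumption
    for brevity; a real system would use an association rule mining algorithm.
    """
    threshold = min_support * len(docs)
    vocab = sorted(set().union(*docs))
    covers = {}
    for size in range(1, max_size + 1):
        for terms in combinations(vocab, size):
            cover = {i for i, d in enumerate(docs) if set(terms) <= d}
            if cover and len(cover) >= threshold:
                covers[frozenset(terms)] = cover
    return covers


def greedy_cluster(docs, min_support=0.4, max_size=2):
    """Greedily pick the term set whose supporting documents overlap least
    with the other candidates, then remove those documents from the rest.

    The overlap score (average number of other candidates covering each
    supporting document) is a hypothetical stand-in, not the paper's measure.
    """
    covers = frequent_term_sets(docs, min_support, max_size)
    unassigned = set(range(len(docs)))
    clusters = []
    while unassigned and covers:
        def overlap(ts):
            others = sum(
                1
                for doc in covers[ts]
                for other, cover in covers.items()
                if other != ts and doc in cover
            )
            return others / len(covers[ts])

        # Prefer low overlap, then larger cover, then a stable ordering.
        best = min(covers, key=lambda ts: (overlap(ts), -len(covers[ts]), sorted(ts)))
        chosen = covers.pop(best)
        clusters.append((sorted(best), sorted(chosen)))
        unassigned -= chosen
        # Each document joins one cluster only, so drop it from other covers.
        covers = {ts: c - chosen for ts, c in covers.items() if c - chosen}
    return clusters, sorted(unassigned)


if __name__ == "__main__":
    example_docs = [
        {"apple", "fruit"},
        {"apple", "fruit", "pie"},
        {"car", "engine"},
        {"car", "engine", "road"},
        {"fruit", "road"},
    ]
    for terms, members in greedy_cluster(example_docs)[0]:
        print(terms, members)
```

Each cluster is returned together with the term set that defines it, which reflects the point made above that frequent term sets double as an understandable cluster description. Documents that no frequent term set covers are returned separately as unassigned.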