One way to organize knowledge and make it easier to search and retrieve is to build a structural representation divided into hierarchically related topics. Once this structure is built, each of the resulting clusters needs a label. In many cases the labels have to be built using only the terms in the documents of the collection. This paper presents SeCLAR (Selecting Candidate Labels using Association Rules), a method that uses association rules to select good candidate labels for hierarchical document clusters. The candidates are then processed by a classical method to generate the labels. The idea of the proposed method is to treat each parent-child relationship between nodes as the antecedent-consequent relationship of an association rule. The experimental results show that the proposed method can improve the precision and recall of labels obtained by classical methods.
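
The parent-child idea can be illustrated with a minimal sketch. This is not the SeCLAR procedure itself, which the abstract does not specify. It only assumes that a term from a parent node acts as the antecedent, a term from a child node acts as the consequent, and a child term is kept as a candidate when the rule passes support and confidence thresholds. The function names, the thresholds, and the toy documents are all hypothetical.

```python
from itertools import product


def rule_stats(docs, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    docs is a list of sets of terms, one set per document.
    """
    n = len(docs)
    both = sum(1 for d in docs if antecedent in d and consequent in d)
    ante = sum(1 for d in docs if antecedent in d)
    support = both / n if n else 0.0
    confidence = both / ante if ante else 0.0
    return support, confidence


def candidate_labels(docs, parent_terms, child_terms,
                     min_support=0.3, min_confidence=0.6):
    """Keep child terms that are consequents of a strong parent -> child rule.

    Assumption: thresholds and the pairing of single terms are illustrative
    choices, not values taken from the paper.
    """
    candidates = set()
    for p, c in product(parent_terms, child_terms):
        if p == c:
            continue
        support, confidence = rule_stats(docs, p, c)
        if support >= min_support and confidence >= min_confidence:
            candidates.add(c)
    return sorted(candidates)


# Toy documents assumed to belong to one child cluster (hypothetical data).
docs = [
    {"mining", "rules", "apriori"},
    {"mining", "rules", "support"},
    {"mining", "clustering"},
    {"mining", "rules"},
]

print(candidate_labels(docs, ["mining"], ["rules", "clustering", "apriori"]))
# → ['rules']
```

In this sketch the selected candidates would then be handed to whatever classical labeling method is in use, mirroring the two-stage flow described in the abstract.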