Topic Discovery from Text Using Aggregation of Different Clustering Methods

Authors:
Hanan Ayad; Mohamed S. Kamel
Venue:
AI '02 Proceedings of the 15th Conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence
Year:
2002


Abstract

Cluster analysis is an unsupervised learning technique that is widely used for topic discovery from text. The research presented here proposes a novel unsupervised learning approach based on aggregating the clusterings produced by different clustering techniques. By examining and combining two different clusterings of a document collection, the aggregation aims to reveal a better structure of the data, rather than one that is imposed or constrained by the clustering method itself. Once clusters of documents are formed, a process called topic extraction picks terms from the feature space (i.e., the vocabulary of the whole collection) to describe the topic of each cluster. The approach proposes re-computing the term weights at this stage according to the revealed cluster structure. The work further investigates the adaptive setup of the parameters required by the clustering and aggregation techniques. Finally, a topic accuracy measure is developed and used alongside the F-measure to evaluate and compare, respectively, the extracted topics and the clustering quality before and after the aggregation. Experimental evaluation shows that the aggregation can improve both the clustering quality and the topic accuracy over the individual clustering techniques.
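
The abstract names two steps that are easy to make concrete: topic extraction with cluster-aware term weights, and the F-measure for clustering quality. It gives no formulas for either, so the following is only a minimal sketch under stated assumptions. The function names `extract_topics` and `clustering_f_measure` are hypothetical. The re-weighting rule (each cluster treated as one pseudo-document, weighted by term frequency times the log of inverse cluster frequency) is an illustrative stand-in and not necessarily the authors' scheme. The F-measure follows the class-weighted best-match form that is common in the document clustering literature.

```python
import math
from collections import Counter, defaultdict


def extract_topics(docs, labels, top_n=3):
    """Return the top_n highest-weighted terms for each cluster.

    Assumption (not from the paper): weights are re-computed by treating
    each cluster as one pseudo-document and scoring a term as
    tf_in_cluster * log(K / number_of_clusters_containing_term).
    """
    cluster_tf = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        cluster_tf[label].update(tokens)

    n_clusters = len(cluster_tf)
    cluster_freq = Counter()  # in how many clusters each term occurs
    for tf in cluster_tf.values():
        cluster_freq.update(tf.keys())

    topics = {}
    for label, tf in cluster_tf.items():
        weights = {
            term: freq * math.log(n_clusters / cluster_freq[term])
            for term, freq in tf.items()
        }
        # Sort by descending weight; break ties alphabetically.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        topics[label] = ranked[:top_n]
    return topics


def clustering_f_measure(true_classes, labels):
    """Class-size-weighted F-measure of a clustering against known classes.

    For each class, take the best F1 over all clusters, then average
    these best scores weighted by class size.
    """
    n = len(labels)
    class_sizes = Counter(true_classes)
    cluster_sizes = Counter(labels)
    overlap = Counter(zip(true_classes, labels))

    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            shared = overlap[(cls, clu)]
            if shared == 0:
                continue
            precision = shared / n_clu
            recall = shared / n_cls
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_cls / n) * best
    return total


# Toy collection: tokenized documents with known classes (made-up data).
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "league"],
    ["stock", "market", "trade"],
    ["market", "stock", "bank"],
]
true_classes = ["sports", "sports", "finance", "finance"]
labels = [0, 0, 1, 1]  # cluster assignment from some clustering method

print(extract_topics(docs, labels, top_n=2))
print(clustering_f_measure(true_classes, labels))
```

Running the same two functions on the labels from each individual clustering and on the labels from the aggregated clustering gives the kind of before-and-after comparison the abstract describes. The topic accuracy measure developed in the paper is not reproduced here because the abstract does not define it.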