Comparing relational and non-relational algorithms for clustering propositional data

Authors:
Robson Motta;Alneu de Andrade Lopes;Bruno M. Nogueira;Solange O. Rezende;Alípio M. Jorge;Maria Cristina Ferreira de Oliveira
Affiliations:
VICG, ICMC, University of Sao Paulo, Sao Carlos, SP, Brazil;LABIC, ICMC, University of Sao Paulo, Sao Carlos, SP, Brazil;LABIC, ICMC, University of Sao Paulo, Sao Carlos, SP, Brazil;LABIC, ICMC, University of Sao Paulo, Sao Carlos, SP, Brazil;LIAAD - INESC TEC, DCC, FCUP, University of Porto, Portugal;VICG, ICMC, University of Sao Paulo, Sao Carlos, SP, Brazil
Venue:
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Year:
2013

Abstract

Cluster detection methods are widely studied in Propositional Data Mining. In this context, each example is represented individually as a feature vector. Such data has a naturally non-relational structure, but it can be represented in relational form through similarity-based network models. In these models, examples are represented by vertices, and an edge connects two examples with high similarity. This relational representation makes it possible to apply network-based algorithms from Relational Data Mining. In clustering tasks specifically, these models make it possible to use community detection algorithms for networks to detect data clusters. In this work, we compare traditional non-relational clustering algorithms with cluster detection algorithms based on relational data using measures for community detection in networks. We carried out an exploratory analysis over 23 numerical datasets and 10 textual datasets. The results show that network models can efficiently represent the data topology, allowing their application in cluster detection with higher precision than non-relational methods.
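
The relational representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a k-nearest-neighbor rule with Euclidean distance as the similarity-based network model, and it uses connected components as a deliberately simple stand-in for a community detection algorithm. The names `knn_network`, `connected_components`, and the toy `points` are hypothetical and chosen only for illustration.

```python
import math


def knn_network(points, k):
    """Build an undirected k-nearest-neighbor network (assumed model).

    Each example is a vertex; an edge links an example to each of its
    k closest examples under Euclidean distance.
    """
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Rank all other examples by distance to example i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)  # keep the network undirected
    return adj


def connected_components(adj):
    """Group vertices into connected components (stand-in for
    community detection; a real study would use a dedicated algorithm)."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


# Toy propositional data: two well-separated groups of feature vectors.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

network = knn_network(points, k=2)
clusters = connected_components(network)
print(clusters)
```

On this toy data the two groups end up in separate components because no example has a nearest neighbor in the other group. On real datasets the network is usually connected, which is why a community detection algorithm is needed rather than plain components.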