We increasingly have to deal with large quantities of unstructured data produced by many different sources. For example, clustering web pages is essential for returning structured information in response to user queries. In this paper, we test a new clustering technique, clustering by compression, on heterogeneous sets of data. The procedure is based on a parameter-free, universal similarity distance, the normalized compression distance (NCD), which is computed from the lengths of compressed data files, taken both singly and in pairwise concatenation. Compression algorithms make it possible to define a similarity measure based on the degree of information shared by two objects, whereas clustering methods group similar data without any prior knowledge.
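
As a rough illustration of how such a distance can be computed, the sketch below uses the commonly cited form of the NCD, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length in bytes. The abstract does not say which compressor is used, so the choice of zlib here is an assumption, and the function names `compressed_len`, `ncd`, and `ncd_matrix` are hypothetical names introduced only for this example.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Uses the compressed lengths of x, of y, and of their concatenation.
    Values near 0 suggest much shared information; values near 1 suggest little.
    """
    cx = compressed_len(x)
    cy = compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_matrix(items: list[bytes]) -> list[list[float]]:
    """Pairwise NCD matrix; the diagonal is set to 0.0 by convention."""
    n = len(items)
    return [
        [0.0 if i == j else ncd(items[i], items[j]) for j in range(n)]
        for i in range(n)
    ]
```

The resulting distance matrix could then be handed to any standard clustering method, for instance a hierarchical one. With a real compressor the measure is only approximately symmetric, and the distance of an object to itself is small rather than exactly zero, which is why the diagonal is fixed by hand in this sketch.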