A neighborhood-based clustering by means of the triangle inequality

Authors:
Marzena Kryszkiewicz;Piotr Lasek
Affiliations:
Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland;Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
Venue:
IDEAL'10 Proceedings of the 11th international conference on Intelligent data engineering and automated learning
Year:
2010

Abstract

Grouping data into meaningful clusters is an important task in both artificial intelligence and data mining. An important group of clustering algorithms consists of density-based ones, which require calculating the neighborhood of a given data point. High-dimensional data are the bottleneck for such algorithms. In this paper, we propose a new TI-k-Neighborhood-Index algorithm that calculates k-neighborhoods for all points in a given data set by means of the triangle inequality. We show experimentally that the NBC (Neighborhood-Based Clustering) algorithm supported by our index outperforms NBC supported by such well-known spatial indices as the VA-file and the R-tree, for both low- and high-dimensional data.
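
The abstract does not spell out how the triangle inequality is used, so the following is only a minimal sketch of one common way to exploit it, not the authors' TI-k-Neighborhood-Index algorithm. It assumes Euclidean distance and a single reference point r (here, hypothetically, the per-coordinate minimum of the data). Since |d(p, r) - d(q, r)| <= d(p, q), points sorted by their distance to r can be scanned outward from p, and the scan can stop once that lower bound exceeds the current k-th best distance. The function name `ti_k_nearest` and its parameters are illustrative, and the sketch returns exactly k neighbors rather than handling ties as a k-neighborhood definition might.

```python
import heapq
import math


def ti_k_nearest(points, k, ref=None):
    """For each point index, return a sorted list of (distance, index)
    for its k nearest neighbours, pruned with the triangle inequality."""
    if ref is None:
        # Assumed reference point: the per-coordinate minimum of the data.
        ref = tuple(min(p[d] for p in points) for d in range(len(points[0])))
    ref_dist = [math.dist(p, ref) for p in points]
    # Indices of points ordered by their distance to the reference point.
    order = sorted(range(len(points)), key=ref_dist.__getitem__)
    n = len(order)
    result = {}
    for pos, i in enumerate(order):
        heap = []  # max-heap of the best k so far, stored as (-distance, index)
        back, fwd = pos - 1, pos + 1
        while back >= 0 or fwd < n:
            # gap = |d(p, r) - d(q, r)| is a lower bound on d(p, q).
            gap_b = ref_dist[i] - ref_dist[order[back]] if back >= 0 else math.inf
            gap_f = ref_dist[order[fwd]] - ref_dist[i] if fwd < n else math.inf
            # Always examine the candidate with the smaller lower bound.
            if gap_b <= gap_f:
                gap, j = gap_b, order[back]
                back -= 1
            else:
                gap, j = gap_f, order[fwd]
                fwd += 1
            # Every remaining candidate has a bound >= gap, so none can
            # improve on the current k-th best distance: stop scanning.
            if len(heap) == k and gap > -heap[0][0]:
                break
            d = math.dist(points[i], points[j])
            if len(heap) < k:
                heapq.heappush(heap, (-d, j))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, j))
        result[i] = sorted((-neg_d, j) for neg_d, j in heap)
    return result
```

Calling `ti_k_nearest(points, 4)` on a list of equal-length coordinate tuples yields the same neighbor distances as a brute-force scan, while typically computing far fewer actual distances when the reference point spreads the data well.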