Adaptive Hausdorff distances and dynamic clustering of symbolic interval data
Pattern Recognition Letters
As a relatively new data mining approach, symbolic data analysis (SDA) can both reduce the computational cost of handling very large data sets and capture the properties of a sample as a whole by packaging the data into symbolic units. Interval-valued data are one of the most important types of symbolic data. Previous studies assumed that individuals are uniformly distributed within each interval, an assumption that often does not hold in practice. This paper defines non-uniform interval symbolic data and focuses on their descriptive univariate and bivariate statistics. Formulas for the mean and variance of non-uniform interval variables are derived from the empirical distribution function of such data. The covariance and correlation coefficient between two non-uniform interval variables are then derived from their empirical joint distribution function. Finally, an example is given.
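The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It treats the empirical distribution as an equal-weight mixture of within-interval distributions, so the sample mean and variance follow from each interval's own mean and variance. The uniform case reproduces the classical uniform-assumption statistics; the triangular density is a hypothetical stand-in for a non-uniform within-interval distribution. All function names, the `mode_frac` parameter, and the assumption that the two variables are independent within each observation (used for the covariance) are illustrative choices, not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

Interval = Tuple[float, float]          # (lower bound, upper bound)
Moments = Tuple[float, float]           # (within-interval mean, within-interval variance)
MomentFn = Callable[[Interval], Moments]


def uniform_moments(iv: Interval) -> Moments:
    """Mean and variance of a uniform distribution on [a, b]."""
    a, b = iv
    return (a + b) / 2.0, (b - a) ** 2 / 12.0


def triangular_moments(iv: Interval, mode_frac: float = 0.5) -> Moments:
    """Mean and variance of a triangular distribution on [a, b].

    ASSUMPTION: the triangular shape and mode_frac (position of the mode
    as a fraction of the interval width) are hypothetical examples of a
    non-uniform within-interval distribution.
    """
    a, b = iv
    c = a + mode_frac * (b - a)
    mean = (a + b + c) / 3.0
    var = (a * a + b * b + c * c - a * b - a * c - b * c) / 18.0
    return mean, var


def interval_mean_variance(
    intervals: Sequence[Interval], moments: MomentFn = uniform_moments
) -> Moments:
    """Mean and variance of the equal-weight mixture over all intervals."""
    n = len(intervals)
    if n == 0:
        raise ValueError("need at least one interval")
    ms = [moments(iv) for iv in intervals]
    mean = sum(m for m, _ in ms) / n
    # E[X^2] of the mixture: average of (variance + mean^2) per interval.
    second = sum(v + m * m for m, v in ms) / n
    return mean, second - mean * mean


def interval_covariance(
    xs: Sequence[Interval], ys: Sequence[Interval],
    moments: MomentFn = uniform_moments,
) -> float:
    """Covariance of two interval variables observed on the same individuals.

    ASSUMPTION: within each observation the two variables are independent,
    so E[XY] for observation i is the product of the within-interval means.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mx = [moments(iv)[0] for iv in xs]
    my = [moments(iv)[0] for iv in ys]
    mean_xy = sum(p * q for p, q in zip(mx, my)) / n
    return mean_xy - (sum(mx) / n) * (sum(my) / n)


data = [(0.0, 2.0), (2.0, 4.0)]
print(interval_mean_variance(data))                      # uniform assumption
print(interval_mean_variance(data, triangular_moments))  # non-uniform stand-in
```

With the same interval bounds, the triangular assumption concentrates mass near the interval centres and therefore yields a smaller variance than the uniform assumption, which illustrates why the within-interval distribution matters for the descriptive statistics.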