We focus on the problem of finding patterns across two large, multidimensional datasets. For example, given feature vectors of healthy and of non-healthy patients, we want to answer the following questions: Are the two clouds of points separable? What is the smallest/largest pairwise distance across the two datasets? Which of the two clouds does a new point (feature vector) come from?

We propose a new tool, the tri-plot, and its generalization, the pq-plot, which help us answer the above questions. We provide a set of rules on how to interpret a tri-plot, and we apply these rules to synthetic and real datasets. We also show how to use our tool for classification in cases where traditional methods (nearest neighbor, classification trees) may fail.
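To make the pairwise-distance question concrete, the following is a minimal brute-force sketch, not the tri-plot construction itself, which the abstract does not spell out. It computes the smallest and largest distance across two small point clouds and counts how many cross-dataset pairs fall within a given radius. The function names (`cross_distances`, `pair_count`), the toy data, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from itertools import product


def cross_distances(a, b):
    """Euclidean distance for every pair (p, q) with p in a and q in b."""
    return [math.dist(p, q) for p, q in product(a, b)]


def pair_count(distances, radius):
    """Number of cross-dataset pairs whose distance is at most `radius`."""
    return sum(1 for d in distances if d <= radius)


# Hypothetical toy data: two tiny 2-D clouds standing in for the
# "healthy" and "non-healthy" feature vectors mentioned in the abstract.
healthy = [(0.0, 0.0), (1.0, 0.0)]
non_healthy = [(4.0, 0.0), (4.0, 3.0)]

dists = cross_distances(healthy, non_healthy)
closest = min(dists)    # smallest pairwise distance across the two clouds
farthest = max(dists)   # largest pairwise distance across the two clouds

# Pair counts at a few radii (the radii are arbitrary illustrative values).
counts = {r: pair_count(dists, r) for r in (3.0, 4.0, 5.0)}
```

This quadratic scan is only practical for small inputs; answering the same questions on very large datasets is precisely where a more scalable tool is needed.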