In recent years, the detrimental effects of the curse of high dimensionality have been studied in great detail for several problems, such as clustering, nearest neighbor search, and indexing. In high-dimensional space the data becomes sparse, and traditional indexing and algorithmic techniques fail from a performance perspective. Recent research results show that in high-dimensional space, the concept of proximity may not even be qualitatively meaningful [6]. In this paper, we outline the effects of generalizing low-dimensional techniques to high-dimensional applications and the natural effects of sparsity on distance-based applications. We outline guidelines for redesigning either the distance functions or the distance-based applications in a meaningful way for high-dimensional domains. We provide novel perspectives and insights on some new lines of work for broadening application definitions in order to deal effectively with the dimensionality curse.
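The claim that proximity loses meaning in high-dimensional space can be illustrated with a small experiment. The sketch below is not taken from the paper: it assumes uniformly random data in the unit cube, a single random query point, and a hypothetical helper named relative_contrast that reports (Dmax - Dmin) / Dmin, the gap between the farthest and nearest neighbor relative to the nearest. Under these assumptions the ratio tends to shrink as the dimensionality grows, which is one common way of showing that the nearest and farthest neighbors become hard to tell apart.

```python
import numpy as np


def relative_contrast(n_points, dim, p=2.0, seed=0):
    """Return (Dmax - Dmin) / Dmin for one random query point.

    Assumptions (illustrative only): data and query are drawn uniformly
    from the unit cube [0, 1]^dim, and distances use the L_p norm.
    """
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, dim))   # n_points x dim data set
    query = rng.random(dim)              # one query point
    # L_p distance from the query to every data point
    dists = np.sum(np.abs(data - query) ** p, axis=1) ** (1.0 / p)
    return (dists.max() - dists.min()) / dists.min()


if __name__ == "__main__":
    # Print the contrast for increasing dimensionality
    for dim in (2, 10, 100, 1000):
        print(dim, relative_contrast(1000, dim))
```

The parameter p lets the same sketch be rerun with other L_p norms (for example p=1.0) to compare how quickly the contrast decays under different distance functions.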