This paper proposes a novel dimensionality reduction method based on a function that approximates the Euclidean distance by using the norm and angle components of a vector. First, we identify the causes of errors in angle estimation when approximating the Euclidean distance, and discuss basic solutions for reducing those errors. We then propose a dimensionality reduction method that composes a set of subvectors from a feature vector and maintains only the norm and the estimated angle of each subvector. Because the accuracy of the angle estimate depends on the choice of reference vector, we present criteria that a good reference vector should satisfy, and propose a method that selects such a vector by using the Levenberg-Marquardt algorithm. We also define a novel distance function and formally prove that it consistently lower-bounds the Euclidean distance, which implies that our approach incurs no false dismissals when reducing the dimensionality. Finally, we verify the superiority of the proposed approach through extensive experiments.
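The norm-and-angle idea can be illustrated with a small sketch. This is not the paper's distance function, which the abstract does not spell out. It is the standard bound that follows from the law of cosines, together with the fact that the angle between two vectors is at least the absolute difference of their angles to a common reference vector. The function names, the contiguous equal-length split into `k` subvectors, and the all-ones reference vectors are assumptions made only for this illustration.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))


def angle(v, ref):
    """Angle in [0, pi] between v and ref; 0.0 if either vector is zero."""
    nv, nr = norm(v), norm(ref)
    if nv == 0.0 or nr == 0.0:
        return 0.0
    c = sum(a * b for a, b in zip(v, ref)) / (nv * nr)
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding


def reduce_vector(v, refs, k):
    """Keep only (norm, angle to reference) for each of k subvectors.

    Assumption: subvectors are contiguous and len(v) is divisible by k.
    """
    if len(v) % k != 0:
        raise ValueError("len(v) must be divisible by k")
    size = len(v) // k
    return [
        (norm(v[i * size:(i + 1) * size]),
         angle(v[i * size:(i + 1) * size], refs[i]))
        for i in range(k)
    ]


def lower_bound(rx, ry):
    """Lower bound on the Euclidean distance from two reduced vectors.

    Per subvector: |x|^2 + |y|^2 - 2|x||y|cos(ax - ay) <= |x - y|^2,
    because the true angle between x and y is at least |ax - ay|.
    """
    total = 0.0
    for (nx, ax), (ny, ay) in zip(rx, ry):
        total += nx * nx + ny * ny - 2.0 * nx * ny * math.cos(ax - ay)
    return math.sqrt(max(0.0, total))


def euclidean(x, y):
    """Exact Euclidean distance, used here only for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical setup: 8-dimensional vectors, 2 subvectors, all-ones references.
random.seed(0)
k, dim = 2, 8
refs = [[1.0] * (dim // k) for _ in range(k)]
x = [random.uniform(-1.0, 1.0) for _ in range(dim)]
y = [random.uniform(-1.0, 1.0) for _ in range(dim)]
lb = lower_bound(reduce_vector(x, refs, k), reduce_vector(y, refs, k))
exact = euclidean(x, y)
```

In this sketch each 8-dimensional vector is stored as 4 numbers (2 norms and 2 angles). Because `lb` never exceeds `exact`, a range or nearest-neighbor filter that prunes on `lb` cannot discard a true answer, which is the no-false-dismissal property the abstract refers to. How tight the bound is depends on the reference vectors, which is why their selection matters.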