Dimensionality reduction using magnitude and shape approximations

Authors:
Ümit Y. Ogras;Hakan Ferhatosmanoglu
Affiliations:
The Ohio State University;The Ohio State University
Venue:
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Year:
2003

Abstract

High-dimensional data sets are encountered in many modern database applications. The usual approach is to construct a summary of the data set through a lossy compression technique and to use this lower-dimensional synopsis to provide fast, approximate answers to queries. In this paper, we develop a novel dimensionality reduction technique based on partitioning the high-dimensional vector space into orthogonal subspaces. First, we find a relation between the Euclidean distance of two n-dimensional vectors and the Euclidean distances of their projections onto the orthogonal subspaces. Then, based on this relation, we develop a method to approximate the Euclidean distance using a novel inner product approximation. This process allows us to incorporate the shape information of the vectors into the approximation. While the inner product approximation is symmetric, i.e., captures only the magnitude information of the data, the proposed method takes both the magnitude and shape information of the original vectors into account through partitioning. In experiments, we demonstrate the effectiveness of our technique by comparing it with commonly used methods.
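The abstract does not spell out the approximation itself, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the orthogonal subspaces are contiguous coordinate blocks and that each block is summarized by its Euclidean norm alone; both choices, and all function names, are illustrative and are not taken from the paper. The sketch checks the exact relation (the squared distance equals the sum of the squared distances of the projections) and shows that per-block norms give a lower bound on the squared distance that can only tighten as the partition is refined.

```python
import math
import random


def split_blocks(v, k):
    """Split v into contiguous blocks (assumed coordinate subspaces)."""
    size = math.ceil(len(v) / k)
    return [v[i:i + size] for i in range(0, len(v), size)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def norm(a):
    """Euclidean norm (magnitude) of a vector."""
    return math.sqrt(sum(p * p for p in a))


def block_norms(v, k):
    """Reduced representation (an assumption): one magnitude per block."""
    return [norm(b) for b in split_blocks(v, k)]


def lower_bound_sq_dist(sx, sy):
    """Lower bound on the squared distance from per-block magnitudes.

    Uses the reverse triangle inequality in every block:
    (||x_k|| - ||y_k||)^2 <= ||x_k - y_k||^2.
    """
    return sum((p - q) ** 2 for p, q in zip(sx, sy))


random.seed(0)
n, k = 64, 8
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]

# Exact relation: distances of the projections add up to the full distance.
exact = sq_dist(x, y)
by_blocks = sum(
    sq_dist(bx, by) for bx, by in zip(split_blocks(x, k), split_blocks(y, k))
)

# Magnitude-only bounds: a single norm versus k per-block norms.
coarse = lower_bound_sq_dist(block_norms(x, 1), block_norms(y, 1))
fine = lower_bound_sq_dist(block_norms(x, k), block_norms(y, k))

print(f"exact={exact:.3f} by_blocks={by_blocks:.3f}")
print(f"coarse bound={coarse:.3f} fine bound={fine:.3f}")
```

With a single block the summary keeps only the overall magnitude. Splitting into more blocks records how that magnitude is distributed across coordinates, which is one simple way to read the "shape" information mentioned in the abstract, although the paper's own construction may differ.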