Similarity search is important in information-retrieval applications, where objects are usually represented as high-dimensional vectors. This paper proposes a new dimensionality-reduction technique and an indexing mechanism for high-dimensional datasets. For each data vector, the proposed technique removes the dimensions whose coordinates are less than a critical value. Because the set of removed dimensions is chosen per vector rather than once for the whole dataset, this flexible datawise dimensionality reduction improves indexing mechanisms for high-dimensional datasets whose distributions are skewed in every coordinate. To apply the technique to information retrieval, a CVA file (compact VA file), a revised version of the VA file, is developed. A CVA file further reduces the size of the index files while keeping the index bounds as tight as possible. Experiments on synthetic and real data confirm the effectiveness of the approach.
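The datawise reduction described above can be illustrated with a minimal sketch. It is not the CVA file itself, whose layout the abstract does not specify. The function names `reduce_vector` and `lower_bound_sq`, the dictionary representation, and the assumption that all coordinates are non-negative (so a removed coordinate lies in the interval from 0 up to the critical value) are illustrative choices, not taken from the paper.

```python
from typing import Dict, Sequence


def reduce_vector(vec: Sequence[float], critical: float) -> Dict[int, float]:
    """Keep only coordinates >= critical, keyed by dimension index.

    Hypothetical representation: the set of kept dimensions differs
    from vector to vector, which is what makes the reduction datawise.
    """
    return {i: x for i, x in enumerate(vec) if x >= critical}


def lower_bound_sq(query: Sequence[float],
                   reduced: Dict[int, float],
                   critical: float) -> float:
    """Lower bound on the squared Euclidean distance to the original vector.

    Assumption (not from the paper): coordinates are non-negative, so a
    removed coordinate is known to lie in [0, critical).
    """
    total = 0.0
    for i, q in enumerate(query):
        if i in reduced:
            # Kept coordinate: the exact difference is available.
            diff = q - reduced[i]
        else:
            # Removed coordinate: the closest it can be to q is
            # 0 away (if q < critical) or q - critical away otherwise.
            diff = max(q - critical, 0.0)
        total += diff * diff
    return total


def exact_sq(query: Sequence[float], vec: Sequence[float]) -> float:
    """Exact squared Euclidean distance, for comparison with the bound."""
    return sum((q - x) ** 2 for q, x in zip(query, vec))


if __name__ == "__main__":
    vec = [0.9, 0.01, 0.0, 0.4, 0.02]    # skewed: most coordinates are tiny
    query = [1.0, 0.0, 0.3, 0.4, 0.0]
    critical = 0.05

    reduced = reduce_vector(vec, critical)
    print("kept dimensions:", sorted(reduced))
    print("lower bound:", lower_bound_sq(query, reduced, critical))
    print("exact:", exact_sq(query, vec))
```

In this sketch a smaller critical value removes fewer coordinates and gives a tighter bound, while a larger one shrinks the stored vector at the cost of a looser bound. That is the trade-off between index size and bound tightness that the abstract refers to.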