Most similarity search techniques map the data objects into some high-dimensional feature space. The similarity search then corresponds to a nearest-neighbor search in the feature space, which is computationally very intensive. In this paper, we present a new parallel method for fast nearest-neighbor search in high-dimensional feature spaces. The core problem of designing a parallel nearest-neighbor algorithm is to find an adequate distribution of the data onto the disks. Unfortunately, the known declustering methods do not perform well for high-dimensional nearest-neighbor search. In contrast, our method has been optimized based on the special properties of high-dimensional spaces and therefore provides a near-optimal distribution of the data items among the disks. The basic idea of our data declustering technique is to assign the buckets corresponding to different quadrants of the data space to different disks. We show that our technique, in contrast to other declustering methods, guarantees that all buckets corresponding to neighboring quadrants are assigned to different disks. We evaluate our method using large amounts of real data (up to 40 MBytes) and compare it with the best known data declustering method, the Hilbert curve. Our experiments show that our method provides an almost linear speed-up and a constant scale-up. Additionally, it outperforms the Hilbert approach by a factor of up to 5.
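The quadrant-based declustering idea can be illustrated with a minimal sketch. The abstract does not state the actual assignment function, so the XOR-based coloring below is an assumption chosen only because it satisfies the stated guarantee: quadrants that differ in a single dimension never share a disk. The names `quadrant_of`, `disk_of`, `neighbors`, and `all_quadrants`, as well as the split value of 0.5 on a unit data space, are hypothetical.

```python
from itertools import product


def quadrant_of(point, split=0.5):
    """Map a point to its quadrant, encoded as one bit per dimension.

    Bit i is 1 if coordinate i lies at or above the split value (assumed 0.5).
    """
    return tuple(1 if x >= split else 0 for x in point)


def disk_of(quadrant):
    """Assumed coloring: XOR together (i + 1) for every dimension i whose bit is set.

    Flipping bit i changes the result by XOR with (i + 1), which is never zero,
    so two quadrants that differ in exactly one dimension get different disks.
    """
    disk = 0
    for i, bit in enumerate(quadrant):
        if bit:
            disk ^= i + 1
    return disk


def neighbors(quadrant):
    """Yield every quadrant that differs from `quadrant` in exactly one dimension."""
    for i in range(len(quadrant)):
        flipped = list(quadrant)
        flipped[i] ^= 1
        yield tuple(flipped)


def all_quadrants(dims):
    """Enumerate all 2**dims quadrants of a dims-dimensional data space."""
    return list(product((0, 1), repeat=dims))


if __name__ == "__main__":
    q = quadrant_of((0.2, 0.9, 0.5))
    print("quadrant:", q, "disk:", disk_of(q))
    for n in neighbors(q):
        print("  neighbor:", n, "disk:", disk_of(n))
```

Under this assumed coloring, a d-dimensional space uses disk numbers below the next power of two greater than d, so a 4-dimensional space needs at most 8 disks. A real deployment with fewer disks would need an additional folding step, which this sketch does not cover.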