Guidelines for presentation and comparison of indexing techniques
ACM SIGMOD Record
Effective nearest neighbor indexing with the euclidean metric
Proceedings of the tenth international conference on Information and knowledge management
ACM Computing Surveys (CSUR)
Fast Nearest Neighbor Search in High-Dimensional Space
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
QBISM: Extending a DBMS to Support 3D Medical Images
Proceedings of the Tenth International Conference on Data Engineering
When Is ''Nearest Neighbor'' Meaningful?
ICDT '99 Proceedings of the 7th International Conference on Database Theory
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
The X-tree: An Index Structure for High-Dimensional Data
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Dimensionality reduction using magnitude and shape approximations
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
An effective method for approximating the euclidean distance in high-dimensional space
DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Hi-index | 0.00 |
Previous researches on multidimensional indexes typically have used synthetic data sets distributed uniformly or normally over multidimensional space for performance evaluation. These kinds of data sets hardly reflect the characteristics of multimedia database applications. In this paper, we discuss issues on generating high dimensional data and query sets for resolving the problem. We first identify the requirements of the data and query sets for fair performance evaluation of multidimensional indexes, and then propose HDDQ_Gen (High-Dimensional Data and Query Generator) that satisfies such requirements. HDDQ_Gen has the following features: (1) clustered distribution, (2) various object distribution in each cluster, (3) various cluster distribution, (4) various correlations among different dimensions, and (5) query distribution depending on data distribution. Using these features, users are able to control the distribution characteristics of data and query sets appropriate for their target applications.