When data resides on tertiary storage, clustering is the key to achieving high retrieval performance. However, a straightforward approach to clustering massive amounts of data on such storage requires computational and storage resources that usually exceed the capabilities of even the richest supercomputing centers. This paper develops a new approach to hierarchical storage management in Data Grid environments, which calls for two levels of clustering data on tertiary storage. By combining static and dynamic decisions, this approach achieves the benefits of clustering at reasonable cost. However, an effective realization of the approach in generic Data Grid environments requires advances in indexing and clustering large scientific data collections on tertiary storage. The paper describes some novel indexing and clustering techniques that cope well not only with extremely large volumes but also with very high dimensionalities of scientific data. It also introduces, for the first time, the basic principles of a new clustering technique for large volumes of multi-dimensional data.