IEEE Transactions on Knowledge and Data Engineering
Data access cost evaluation is fundamental to the design and management of database systems. When some data items have duplicates, a clustering effect arises that can heavily influence access costs. The finite amount of buffer memory available in real systems has an even more dramatic impact. A comprehensive cost model for clustered data retrieval through an index using a finite buffer is presented. The approach combines and extends previous models based on either finite-buffer or uniform-data-clustering assumptions. The computational cost of the proposed formulas is independent of both the data size and the query cardinality, and the system needs to maintain only a single statistic per search key, the clustering factor. The predictive power and accuracy of the model are demonstrated by comparison with actual costs obtained from simulations.
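To make the two effects named above concrete, the following is a minimal simulation sketch, not the cost model of the paper. It counts page fetches for an index scan in key order under an LRU buffer, for data that is either perfectly clustered or progressively shuffled. All function names, the `shuffle_fraction` knob, and the choice of LRU replacement are assumptions introduced for illustration. The classical Cardenas estimate for purely random placement with no buffer reuse is included only as a familiar baseline.

```python
import random
from collections import OrderedDict


def cardenas_blocks(num_blocks: int, num_records: int) -> float:
    """Classical Cardenas estimate: expected number of distinct blocks hit
    when num_records records are picked at random (with replacement)."""
    return num_blocks * (1.0 - (1.0 - 1.0 / num_blocks) ** num_records)


def lru_page_fetches(page_refs, buffer_pages: int) -> int:
    """Count page fetches (buffer misses) for a page reference string,
    assuming an LRU buffer that holds buffer_pages pages."""
    buffer = OrderedDict()  # keys are page ids, ordered from LRU to MRU
    fetches = 0
    for page in page_refs:
        if page in buffer:
            buffer.move_to_end(page)  # hit: mark as most recently used
        else:
            fetches += 1  # miss: the page must be read from disk
            buffer[page] = True
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)  # evict least recently used
    return fetches


def index_scan_refs(num_pages, records_per_page, shuffle_fraction, rng):
    """Page reference string of a full index scan in key order.

    Records start out stored in key order (perfect clustering); then
    shuffle_fraction * n random swaps degrade the clustering.
    This placement scheme is a hypothetical stand-in for real data.
    """
    n = num_pages * records_per_page
    placement = list(range(n))  # placement[slot] = key rank stored there
    for _ in range(int(shuffle_fraction * n)):
        i, j = rng.randrange(n), rng.randrange(n)
        placement[i], placement[j] = placement[j], placement[i]
    page_of_key = [0] * n
    for slot, key in enumerate(placement):
        page_of_key[key] = slot // records_per_page
    return page_of_key  # scanning keys 0..n-1 visits these pages in order


if __name__ == "__main__":
    rng = random.Random(42)
    for fraction in (0.0, 0.1, 1.0):
        refs = index_scan_refs(50, 20, fraction, rng)
        for buf in (1, 5, 20, 50):
            print(fraction, buf, lru_page_fetches(refs, buf))
    print("Cardenas baseline:", cardenas_blocks(50, 1000))
```

In this sketch, a perfectly clustered file costs one fetch per page regardless of buffer size, while a shuffled file re-fetches pages whenever the buffer is too small to retain them. A closed-form model of the kind the abstract describes would aim to predict such counts from a clustering statistic without running a simulation.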