This paper proposes a scheme for scheduling disk requests that takes advantage of the ability of high-level functions to operate directly at individual disk drives. We show that such a scheme makes it possible to support a Data Mining workload on an OLTP system almost for free: there is only a small impact on the throughput and response time of the existing workload. Specifically, we show that an OLTP system has the disk resources to consistently provide one third of its sequential bandwidth to a background Data Mining task with close to zero impact on OLTP throughput and response time at high transaction loads. At low transaction loads, we show much lower impact than observed in previous work. This means that a production OLTP system can be used for Data Mining tasks without the expense of a second dedicated system. Our scheme takes advantage of close interaction with the on-disk scheduler by reading blocks for the Data Mining workload as the disk head “passes over” them while satisfying demand blocks from the OLTP request stream. We show that this scheme provides a consistent level of throughput for the background workload even at very high foreground loads. Such a scheme is of most benefit in combination with an Active Disk environment that allows the background Data Mining application to also take advantage of the processing power and memory available directly on the disk drives.
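The idea of reading background blocks as the head "passes over" them can be illustrated with a deliberately simplified sketch. It is not the paper's scheduler: it assumes a single track of equally sized sectors, ignores seeks and transfer times, and treats every sector that rotates under the head while it waits for a demand sector as readable at no extra cost. The names `sectors_passed`, `serve_foreground`, and `wanted` are hypothetical and introduced only for illustration.

```python
from typing import List, Set, Tuple


def sectors_passed(head: int, target: int, sectors_per_track: int) -> List[int]:
    """Sectors that rotate under the head while it waits for `target`.

    Toy assumption: one track, head stays on it, sectors numbered 0..N-1.
    """
    wait = (target - head) % sectors_per_track
    return [(head + i) % sectors_per_track for i in range(wait)]


def serve_foreground(
    head: int, target: int, wanted: Set[int], sectors_per_track: int
) -> Tuple[List[int], int]:
    """Serve one demand (OLTP-style) request for `target`.

    Any sector still in `wanted` (the background scan's to-do set) that
    passes under the head during the rotational wait is picked up and
    removed from `wanted`. Returns the sectors picked up and the head
    position after the demand sector has been read.
    """
    free = [s for s in sectors_passed(head, target, sectors_per_track) if s in wanted]
    wanted.difference_update(free)
    new_head = (target + 1) % sectors_per_track
    return free, new_head


if __name__ == "__main__":
    n = 8
    wanted = set(range(n))      # background scan wants every sector once
    head = 0
    for target in [5, 2, 7]:    # hypothetical foreground request stream
        free, head = serve_foreground(head, target, wanted, n)
        print(f"demand {target}: picked up {free}, still wanted {sorted(wanted)}")
```

In this toy model the background scan makes progress only during waits that the foreground requests incur anyway, which is why the foreground stream is not delayed. A real drive would also have to account for seek time, track switches, and zoned geometry, none of which is modeled here.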