Current data-intensive scalable computing (DISC) systems, although scalable, achieve embarrassingly low rates of processing per node. We feel that current DISC systems have repeated a mistake of old high-performance systems: focusing on scalability without considering efficiency. This poor efficiency comes with issues in reliability, energy, and cost. As the gap between theoretical performance and what is actually achieved has become glaringly large, we feel there is a pressing need to rethink the design of future data-intensive computing and carefully consider the direction of future research.