Distributed programming in Argus
Communications of the ACM
Foundations of parallel programming
Principles of distributed database systems (2nd ed.)
The state of the art in locally distributed Web-server systems
ACM Computing Surveys (CSUR)
Transaction Processing: Concepts and Techniques
Notes on Data Base Operating Systems
Operating Systems, An Advanced Course
Two-Layer Transaction Management for Workflow Management Applications
DEXA '97 Proceedings of the 8th International Conference on Database and Expert Systems Applications
JDBC API Tutorial and Reference
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Distributed Systems: Concepts and Design (4th Edition) (International Computer Science)
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Dryad: distributed data-parallel programs from sequential building blocks
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Sprint: a middleware for high-performance transaction processing
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
The end of an architectural era: (it's time for a complete rewrite)
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
An architecture for recycling intermediates in a column-store
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Building a high-level dataflow system on top of Map-Reduce: the Pig experience
Proceedings of the VLDB Endowment
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Distributed and fault-tolerant execution framework for transaction processing
Proceedings of the 4th Annual International Conference on Systems and Storage
A scale-out system is a cluster of commodity machines and offers a good platform for steadily increasing workloads that process growing data sets. Sharding [4] is a method of partitioning data and distributing a computation across a scale-out system. In a database system, a large table can be partitioned into smaller tables so that each node processes its own part of the computation. In large batch transaction processing, which is important in the financial sector, the sharding approach presents two hard problems to programmers: they have to write complex code (1) to transfer the input data so that the computations align with the data partitions, and (2) to manage the distributed transactions. This paper presents a new parallel programming framework that makes parallel transactional programming easier: programmers specify transaction scopes and partitioners, which simplifies the code. A transaction scope includes a series of subtransactions, each of which performs local operations. The system manages the distributed transactions automatically. A partitioner represents how the computation should be decomposed and aligned with the data partitions to avoid remote database accesses. Between pairs of subtransactions, the system handles the data shuffling across the network. We implemented our parallel programming framework as a new Java class library that hides all of the complex details of data transfer and distributed transaction management. Our programming framework eliminates almost 66% of the lines of code compared to a current programming approach without framework support. We also confirmed good scalability, with a scaling factor of 20.6 on 24 nodes using our modified batch program for the TPC-C benchmark.
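To make the roles of a partitioner and the shuffle step more concrete, the following is a minimal single-process sketch in Java. It is not the library described in the abstract: every name here (`PartitionerSketch`, `Order`, `nodeFor`, `shuffle`, `runLocalSubtransactions`) is hypothetical, hash-based sharding on a warehouse id is an assumption, and the "subtransaction" is only a local sum with no real database access or commit protocol.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

public class PartitionerSketch {

    /** Hypothetical input record: an order amount tagged with its warehouse id. */
    static final class Order {
        final int warehouseId;
        final int amount;

        Order(int warehouseId, int amount) {
            this.warehouseId = warehouseId;
            this.amount = amount;
        }

        int warehouseId() { return warehouseId; }
        int amount() { return amount; }
    }

    /** Assumed sharding rule: the node that owns a key is the key modulo the node count. */
    static int nodeFor(int key, int nodeCount) {
        return Math.floorMod(key, nodeCount);
    }

    /**
     * Stand-in for the network shuffle: groups input records by the node
     * that owns the matching data partition, so later work stays local.
     */
    static <T> Map<Integer, List<T>> shuffle(List<T> input, ToIntFunction<T> keyOf, int nodeCount) {
        Map<Integer, List<T>> byNode = new HashMap<>();
        for (T item : input) {
            int node = nodeFor(keyOf.applyAsInt(item), nodeCount);
            byNode.computeIfAbsent(node, n -> new ArrayList<>()).add(item);
        }
        return byNode;
    }

    /**
     * Stand-in for one subtransaction per node: each node touches only the
     * records routed to it. Here the local operation is a sum of amounts.
     */
    static Map<Integer, Integer> runLocalSubtransactions(Map<Integer, List<Order>> byNode) {
        Map<Integer, Integer> totals = new HashMap<>();
        byNode.forEach((node, orders) ->
                totals.put(node, orders.stream().mapToInt(Order::amount).sum()));
        return totals;
    }

    public static void main(String[] args) {
        List<Order> input = List.of(
                new Order(0, 10), new Order(1, 20), new Order(4, 5), new Order(5, 7));
        Map<Integer, List<Order>> byNode = shuffle(input, Order::warehouseId, 4);
        System.out.println(runLocalSubtransactions(byNode));
    }
}
```

In this sketch the key-to-node rule plays the part of the partitioner, and `shuffle` plays the part of the data transfer that the abstract says the system handles between subtransactions. A real implementation would also need to coordinate commit or abort across the nodes, which is omitted here.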