We present Schism, a novel workload-aware approach for database partitioning and replication designed to improve the scalability of shared-nothing distributed databases. Because distributed transactions are expensive in OLTP settings (a fact we demonstrate through a series of experiments), our partitioner attempts to minimize the number of distributed transactions while producing balanced partitions. Schism consists of two phases: i) a workload-driven, graph-based replication/partitioning phase and ii) an explanation and validation phase. The first phase creates a graph with a node per tuple (or group of tuples) and edges between nodes accessed by the same transaction, and then uses a graph partitioner to split the graph into k balanced partitions that minimize the number of cross-partition transactions. The second phase exploits machine learning techniques to find a predicate-based explanation of the partitioning strategy (i.e., a set of range predicates that represent the same replication/partitioning scheme produced by the partitioner). The strengths of Schism are: i) independence from the schema layout, ii) effectiveness on n-to-n relations, typical in social network databases, and iii) a unified and fine-grained approach to replication and partitioning. We implemented and tested a prototype of Schism on a wide spectrum of test cases, ranging from classical OLTP workloads (e.g., TPC-C and TPC-E) to more complex scenarios derived from social network websites (e.g., Epinions.com), whose schemas contain multiple n-to-n relationships, which are known to be hard to partition. Schism consistently outperforms simple partitioning schemes, and in some cases proves superior to the best known manual partitioning, reducing the cost of distributed transactions by up to 30%.
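The first phase can be illustrated with a minimal Python sketch. It is not the Schism implementation: the function names, the toy workload, and the naive greedy heuristic (standing in for a real graph partitioner) are all assumptions made for illustration. It builds a graph with one node per tuple and an edge between tuples accessed by the same transaction, assigns nodes to k size-balanced partitions, and counts how many transactions end up distributed.

```python
from collections import defaultdict
from itertools import combinations


def build_coaccess_graph(transactions):
    """Return {(u, v): weight} with u < v, where weight is the number of
    transactions that access both tuples u and v."""
    edges = defaultdict(int)
    for txn in transactions:
        for u, v in combinations(sorted(set(txn)), 2):
            edges[(u, v)] += 1
    return dict(edges)


def count_distributed(transactions, assignment):
    """Count transactions whose tuples span more than one partition."""
    return sum(1 for txn in transactions
               if len({assignment[t] for t in txn}) > 1)


def greedy_partition(transactions, k):
    """Hypothetical stand-in for a graph partitioner: place each node in the
    non-full partition holding the most co-access weight to nodes already
    placed, breaking ties toward the emptier, lower-numbered partition."""
    edges = build_coaccess_graph(transactions)
    nodes = sorted({t for txn in transactions for t in txn})
    capacity = -(-len(nodes) // k)  # ceil(len(nodes) / k) keeps sizes balanced

    neighbours = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbours[u][v] = w
        neighbours[v][u] = w

    assignment, sizes = {}, [0] * k
    for node in nodes:
        score = [0] * k
        for nb, w in neighbours[node].items():
            if nb in assignment:
                score[assignment[nb]] += w
        candidates = [p for p in range(k) if sizes[p] < capacity]
        best = max(candidates, key=lambda p: (score[p], -sizes[p], -p))
        assignment[node] = best
        sizes[best] += 1
    return assignment


# Toy workload (assumed): each transaction is the set of tuple ids it touches.
workload = [("a", "b"), ("a", "b"), ("c", "d"), ("b", "c")]
workload_aware = greedy_partition(workload, k=2)
alternating = {"a": 0, "b": 1, "c": 0, "d": 1}  # a workload-oblivious placement

print(count_distributed(workload, workload_aware))
print(count_distributed(workload, alternating))
```

On this toy workload the co-access-driven placement keeps the frequently co-accessed tuples `a` and `b` together, so fewer transactions cross partitions than under the workload-oblivious placement. A production-quality partitioner would optimize the cut globally rather than greedily, and the sketch omits replication entirely.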