SQLMR: A Scalable Database Management System for Cloud Computing

Authors:
Meng-Ju Hsieh;Chao-Rui Chang;Li-Yung Ho;Jan-Jan Wu;Pangfeng Liu
Venue:
ICPP '11 Proceedings of the 2011 International Conference on Parallel Processing
Year:
2011

Citing 0
Cited 1

On Providing DDL Support for a Relational Layer over a Document NoSQL Database

Proceedings of International Conference on Information Integration and Web-based Applications & Services

Abstract

As the size of data sets in the cloud grows rapidly, processing large amounts of data efficiently has become a critical issue. MapReduce provides a framework for large-scale data processing and has been shown to be scalable and fault-tolerant on commodity machines. However, it has a steeper learning curve than SQL-like languages, and MapReduce code is hard to maintain and reuse. Traditional SQL-based data processing, on the other hand, is familiar to users but limited in scalability. In this paper, we propose a hybrid approach that bridges the gap between SQL-based and MapReduce data processing. We develop a data management system for the cloud, named SQLMR, which compiles SQL-like queries into a sequence of MapReduce jobs. Existing SQL-based applications are seamlessly compatible with SQLMR, and users can manage terabyte- to petabyte-scale data with SQL-like queries instead of writing MapReduce code. We also devise a number of optimization techniques to improve the performance of SQLMR. The experimental results demonstrate the performance and scalability advantages of SQLMR over MySQL and two NoSQL data processing systems, Hive and HadoopDB.
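
To make the idea of compiling a SQL-like query into MapReduce work concrete, the following minimal sketch shows how a single aggregate query could be expressed as one map step and one reduce step. It is an illustration only and not SQLMR's actual compiler or generated code. The table `emp(dept, salary)`, the sample rows, and the names `map_phase`, `reduce_phase`, and `run_job` are all hypothetical, and the in-process `run_job` merely stands in for a real MapReduce runtime. The query being mimicked is `SELECT dept, SUM(salary) FROM emp WHERE salary > 1000 GROUP BY dept`.

```python
from collections import defaultdict

# Hypothetical rows of a table emp(dept, salary); not data from the paper.
ROWS = [
    {"dept": "eng", "salary": 3000},
    {"dept": "eng", "salary": 500},
    {"dept": "ops", "salary": 2000},
    {"dept": "ops", "salary": 1500},
]


def map_phase(row):
    """WHERE filter plus projection: emit (group key, value) pairs."""
    if row["salary"] > 1000:
        yield row["dept"], row["salary"]


def reduce_phase(key, values):
    """Aggregate: SUM over all values that share one group key."""
    return key, sum(values)


def run_job(rows, mapper, reducer):
    """Toy single-process stand-in for a MapReduce runtime (an assumption)."""
    groups = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            groups[key].append(value)  # shuffle step: group values by key
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


result = run_job(ROWS, map_phase, reduce_phase)
print(result)
```

In this sketch the `WHERE` clause and column projection land in the map step, while `GROUP BY` and `SUM` land in the shuffle and reduce steps. A query with joins or nested subqueries would plausibly need several such jobs chained together, which is consistent with the abstract's description of "a sequence of MapReduce jobs".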