This paper introduces Clustera, an integrated computation and data management system. In contrast to traditional cluster-management systems that target specific types of workloads, Clustera is designed for extensibility, so that it can be easily extended to handle a wide variety of job types, ranging from computationally intensive, long-running jobs with minimal I/O requirements to complex SQL queries over massive relational tables. Another distinguishing feature of Clustera is the way its architecture exploits modern software building blocks, including application servers and relational database systems, to realize important performance, scalability, portability, and usability benefits. Finally, experimental evaluation suggests that Clustera has good scale-up properties for SQL processing, that it delivers performance comparable to Hadoop for MapReduce processing, and that it can support higher job throughput rates than previously published results for the Condor and CondorJ2 batch computing systems.