Hyracks is a new partitioned-parallel software platform designed to run data-intensive computations on large shared-nothing clusters of computers. Hyracks allows users to express a computation as a DAG of data operators and connectors. Operators operate on partitions of input data and produce partitions of output data, while connectors repartition operators' outputs to make the newly produced partitions available at the consuming operators. We describe the Hyracks end-user model, for authors of dataflow jobs, and the extension model, for users who wish to augment Hyracks' built-in library with new operator and/or connector types. We also describe our initial Hyracks implementation. Since Hyracks is in roughly the same space as the open-source Hadoop platform, we compare Hyracks with Hadoop experimentally for several different kinds of use cases. The initial results demonstrate that Hyracks has significant promise as a next-generation platform for data-intensive applications.
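To make the operator/connector split concrete, the following is a minimal single-process sketch in Python of a two-stage word-count dataflow. It is not the Hyracks API: the names `tokenize_operator`, `hash_connector`, `count_operator`, and `run_job` are hypothetical, and partitions are modeled as plain in-memory lists rather than data distributed across cluster nodes.

```python
from collections import Counter
from typing import List

Partition = List[str]


def tokenize_operator(partition: Partition) -> Partition:
    """Operator (hypothetical): reads one input partition, emits one output partition."""
    return [word for line in partition for word in line.split()]


def hash_connector(partitions: List[Partition], n_consumers: int) -> List[Partition]:
    """Connector (hypothetical): repartitions producer output by key,
    so every occurrence of a word reaches the same consuming operator."""
    out: List[Partition] = [[] for _ in range(n_consumers)]
    for partition in partitions:
        for word in partition:
            # Byte-sum hash keeps routing deterministic across runs;
            # Python's built-in str hash is randomized per process.
            out[sum(word.encode()) % n_consumers].append(word)
    return out


def count_operator(partition: Partition) -> Counter:
    """Operator (hypothetical): aggregates the words in its own partition."""
    return Counter(partition)


def run_job(input_partitions: List[Partition], n_consumers: int) -> List[Counter]:
    """Runs the DAG: tokenize -> hash repartition -> count."""
    tokenized = [tokenize_operator(p) for p in input_partitions]
    repartitioned = hash_connector(tokenized, n_consumers)
    return [count_operator(p) for p in repartitioned]


# Two input partitions, two consuming count operators.
input_partitions = [["a b a", "c"], ["b a"]]
result = run_job(input_partitions, 2)
total = sum(result, Counter())
```

Because the connector routes by key, each word is counted by exactly one consumer, so the per-partition counts can be combined without further reconciliation. In a real deployment each operator instance would run on a different node and the connector would move data over the network; the sketch only illustrates the division of responsibilities.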