The map-reduce framework has received significant attention and is being used to program both large-scale clusters and multi-core systems. While the high productivity of map-reduce is well accepted, it is not clear whether its API leads to efficient implementations for different subclasses of data-intensive applications. In this paper, we present MATE (Map-reduce with an Alternate API), a system that provides a high-level but distinct API. In particular, our API includes a programmer-managed reduction object, which lowers runtime memory requirements for many data-intensive applications. MATE implements this API on top of the Phoenix system, a multi-core map-reduce implementation from Stanford. We evaluate our system using three data mining applications and compare its performance to that of both Phoenix and Hadoop. Our results show that MATE outperforms Phoenix and Hadoop on all three applications. In addition to achieving good scalability, MATE retains the ease of use of the map-reduce API. Overall, we argue that our approach, which is based on the generalized reduction structure, provides an alternate high-level API that leads to more efficient and scalable implementations.
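To make the idea of a programmer-managed reduction object concrete, the following is a minimal Python sketch contrasting the two styles on a simple counting task. It is an illustration only and not MATE's actual API: the function names (`mapreduce_count`, `local_reduction`, `global_combine`, `generalized_reduction_count`) and the use of a dictionary as the reduction object are assumptions made for this example. The point it tries to show is that the conventional style buffers one intermediate pair per input element, whereas the generalized reduction style updates a small object in place, so its memory footprint grows with the number of distinct keys rather than with the input size.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def mapreduce_count(chunks: Iterable[List[str]]) -> Dict[str, int]:
    """Conventional style: map emits (key, value) pairs, the pairs are
    buffered and grouped by key, and only then reduced."""
    intermediate: List[Tuple[str, int]] = []
    for chunk in chunks:
        for item in chunk:
            intermediate.append((item, 1))  # one buffered pair per element

    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in intermediate:
        grouped[key].append(value)

    return {key: sum(values) for key, values in grouped.items()}


def local_reduction(chunk: List[str]) -> Dict[str, int]:
    """Generalized reduction style (hypothetical names): each element is
    folded directly into a reduction object; no pairs are stored."""
    reduction_object: Dict[str, int] = defaultdict(int)
    for item in chunk:
        reduction_object[item] += 1  # in-place update
    return reduction_object


def global_combine(reduction_objects: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Merge the per-chunk (per-thread) reduction objects into one result."""
    combined: Dict[str, int] = defaultdict(int)
    for robj in reduction_objects:
        for key, value in robj.items():
            combined[key] += value
    return dict(combined)


def generalized_reduction_count(chunks: Iterable[List[str]]) -> Dict[str, int]:
    # In a real runtime each chunk would be processed by a separate thread;
    # here the chunks are processed sequentially for simplicity.
    return global_combine(local_reduction(chunk) for chunk in chunks)


if __name__ == "__main__":
    data = [["a", "b", "a"], ["b", "c"], ["a"]]
    print(mapreduce_count(data))
    print(generalized_reduction_count(data))
```

Both functions compute the same counts. The difference lies in what is held in memory along the way: the first keeps a list whose length equals the number of input elements, while the second keeps only one small dictionary per chunk.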