Iterative computation is pervasive in applications such as data mining, web ranking, graph analysis, and online social network analysis. These applications typically involve massive data sets containing millions or billions of records, which creates demand for distributed computing frameworks that can process such data on a cluster of machines. MapReduce is one such framework. However, MapReduce lacks built-in support for iterative processing, in which the same data set must be processed repeatedly. Besides specifying the MapReduce jobs, users have to write a driver program that submits a series of jobs and performs convergence testing at the client. This paper presents iMapReduce, a distributed framework that supports iterative processing. iMapReduce lets users specify an iterative computation with separate map and reduce functions, and it handles the iteration automatically within a single job. More importantly, iMapReduce significantly improves the performance of iterative implementations by (1) reducing the overhead of repeatedly creating new MapReduce jobs, (2) eliminating the shuffling of static data, and (3) allowing asynchronous execution of map tasks. We implement an iMapReduce prototype based on Apache Hadoop and show that it achieves up to a 5 times speedup over Hadoop when running iterative algorithms.
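The client-side driver pattern that the abstract describes for plain MapReduce can be illustrated with a small sketch. This is a toy, in-memory stand-in, not Hadoop's or iMapReduce's API: `run_job`, `driver`, the three-node `GRAPH`, the damping factor, and the tolerance are all hypothetical names and values chosen for illustration, and PageRank is used only as a familiar example of an iterative algorithm.

```python
from collections import defaultdict

def run_job(map_fn, reduce_fn, records):
    """One toy 'job': map every record, group by key, reduce each group.
    (Hypothetical helper; a real framework would distribute this work.)"""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Static data: the link graph does not change between iterations.
# In this sketch the map function simply reads it from memory.
GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
DAMPING = 0.85  # assumed value, for illustration only

def pagerank_map(node, rank):
    yield node, 0.0  # keep the node in the output even without in-links
    for neighbor in GRAPH[node]:
        yield neighbor, rank / len(GRAPH[node])

def pagerank_reduce(node, contributions):
    return (1 - DAMPING) / len(GRAPH) + DAMPING * sum(contributions)

def driver(max_iterations=100, tolerance=1e-6):
    """Client-side driver: submit one job per iteration and test
    convergence between jobs."""
    ranks = {node: 1.0 / len(GRAPH) for node in GRAPH}
    for iteration in range(1, max_iterations + 1):
        new_ranks = run_job(pagerank_map, pagerank_reduce, ranks.items())
        delta = sum(abs(new_ranks[n] - ranks[n]) for n in GRAPH)
        ranks = new_ranks
        if delta < tolerance:
            break
    return ranks, iteration

if __name__ == "__main__":
    final_ranks, iterations = driver()
    print(iterations, final_ranks)
```

Every pass through the loop in `driver` corresponds to a separately submitted job, and the convergence check runs at the client between jobs. These are the per-iteration costs that the abstract says iMapReduce avoids by running the whole iteration inside a single job.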