Our world is being revolutionized by data-driven methods: access to large amounts of data has generated new insights and opened exciting new opportunities in commerce, science, and computing applications. Processing the enormous quantities of data necessary for these advances requires large clusters, making distributed computing paradigms more crucial than ever. MapReduce is both a programming model for expressing distributed computations on massive datasets and an execution framework for large-scale data processing on clusters of commodity servers. The programming model provides an easy-to-understand abstraction for designing scalable algorithms, while the execution framework transparently handles many system-level details, ranging from scheduling to synchronization to fault tolerance. This book focuses on MapReduce algorithm design, with an emphasis on text processing algorithms common in natural language processing, information retrieval, and machine learning. We introduce the notion of MapReduce design patterns, which represent general reusable solutions to commonly occurring problems across a variety of problem domains. This book not only helps the reader "think in MapReduce" but also discusses the limitations of the programming model.

Table of Contents: Introduction / MapReduce Basics / MapReduce Algorithm Design / Inverted Indexing for Text Retrieval / Graph Algorithms / EM Algorithms for Text Processing / Closing Remarks
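To make the programming model mentioned in the abstract concrete, the following is a minimal single-process sketch of a word count expressed as a map function and a reduce function. It is an illustration only, not code from the book and not the API of any real framework: `run_mapreduce`, `map_word_count`, and `reduce_word_count` are hypothetical names, and the in-memory grouping step merely stands in for the shuffle that an execution framework would perform across a cluster.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

KeyValue = Tuple[str, int]


def map_word_count(doc_id: str, text: str) -> Iterator[KeyValue]:
    """Mapper (hypothetical): emit (word, 1) for every token in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_word_count(word: str, counts: Iterable[int]) -> Iterator[KeyValue]:
    """Reducer (hypothetical): sum all partial counts emitted for one word."""
    yield word, sum(counts)


def run_mapreduce(
    records: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterator[KeyValue]],
    reducer: Callable[[str, Iterable[int]], Iterator[KeyValue]],
) -> List[KeyValue]:
    """Toy driver: map every record, group values by key, then reduce.

    A real execution framework would distribute these phases over many
    machines and handle scheduling and fault tolerance; this sketch
    assumes everything fits in memory on one machine.
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # stand-in for the shuffle

    results: List[KeyValue] = []
    for out_key in sorted(groups):  # reducers see keys in sorted order
        results.extend(reducer(out_key, groups[out_key]))
    return results


documents = [
    ("d1", "a rose is a rose"),
    ("d2", "a daisy is not a rose"),
]
word_counts = dict(run_mapreduce(documents, map_word_count, reduce_word_count))
print(word_counts)
```

The point of the split is that the mapper and reducer contain only problem-specific logic, while the driver owns grouping and ordering; that separation is what lets a framework take over the system-level details.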