Data-intensive text processing with MapReduce

Authors:
Jimmy Lin; Chris Dyer
Affiliations:
University of Maryland, College Park; University of Maryland, College Park
Venue:
NAACL-Tutorials '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts
Year:
2009

Citing 2

MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6

Exploring large-data issues in the curriculum: a case study with MapReduce
TeachCL '08 Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics

Cited 3

Investigation of data locality and fairness in MapReduce
Proceedings of the Third International Workshop on MapReduce and its Applications

MapReduce indexing strategies: Studying scalability and efficiency
Information Processing and Management: an International Journal

Exploiting and Evaluating MapReduce for Large-Scale Graph Mining
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)

Abstract

This half-day tutorial introduces participants to data-intensive text processing with the MapReduce programming model [1], using the open-source Hadoop implementation. The focus will be on scalability and the tradeoffs associated with distributed processing of large datasets. Content will include general discussions of algorithm design, presentation of illustrative algorithms, case studies in HLT applications, and practical advice on writing Hadoop programs and running Hadoop clusters.