SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Scaling to very very large corpora for natural language disambiguation
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Cluster computing for web-scale data processing
Proceedings of the 39th SIGCSE technical symposium on Computer science education
Pairwise document similarity in large collections with MapReduce
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Fast, easy, and cheap: construction of statistical machine translation models with MapReduce
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Data-intensive text processing with MapReduce
NAACL-Tutorials '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts
Fast, easy, and cheap: construction of statistical machine translation models with MapReduce
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Proceedings of the 19th international conference on World wide web
Prototyping an online wetland ecosystem services model using open model sharing standards
Environmental Modelling & Software
Hi-index | 0.00 |
This paper describes the design of a pilot research and educational effort at the University of Maryland centered around technologies for tackling Web-scale problems. In the context of a "cloud computing" initiative lead by Google and IBM, students and researchers are provided access to a computer cluster running Hadoop, an open-source Java implementation of Google's MapReduce framework. This technology provides an opportunity for students to explore large-data issues in the context of a course organized around teams of graduate and undergraduate students, in which they tackle open research problems in the human language technologies. This design represents one attempt to bridge traditional instruction with real-world, large-data research challenges.