In this paper we present the design of a modern course in cluster computing and large-scale data processing. The defining differences between this design and previously published ones are its focus on processing very large data sets and its use of Hadoop, an open-source Java-based implementation of MapReduce and the Google File System, as the platform for programming exercises. Hadoop proved to be a key element in successfully implementing structured lab activities and independent design projects. Through this course, offered at the University of Washington in 2007, we imparted new skills to our students, improving their ability to design systems capable of solving web-scale problems.