SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
The Facebook Data Infrastructure supports a wide range of applications, including both external-facing products and services and internal applications. This paper focuses on Facebook's Data Warehousing and Analytics platform, which supports batch-oriented analytics applications. Facebook's data infrastructure is built largely on open-source technologies such as Apache Hadoop, HDFS, MapReduce, and Hive, and provides a rich set of tools that let different users run analytics queries on Facebook data. As the Facebook user base grows, we continue to enhance our data platform to meet the challenges of scaling with increasing amounts of data.