The need to analyze structured data for business intelligence applications such as customer churn analysis and social network analysis is well known. However, the volume that such data is expected to reach will make solutions built around data warehouses hard to scale. We begin by presenting a business case that prompted us to explore building a distributed analytics platform that leverages the MapReduce framework pioneered by Google. We present the results of the study and highlight issues with current structured data access techniques for MapReduce platforms. Finally, we present a distributed, scalable data platform that leverages Apache Hadoop to enable business analysts to seamlessly query archived data along with data stored in the warehouse.