Massive structured data management solution

Authors:
Ullas Nambiar;Rajeev Gupta;Himanshu Gupta;Mukesh Mohania
Affiliations:
IBM Research India, New Delhi, India;IBM Research India, New Delhi, India;IBM Research India, New Delhi, India;IBM Research India, New Delhi, India
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Abstract

The need to analyze structured data for business intelligence applications such as customer churn analysis and social network analysis is well known. However, the size to which such data is expected to grow will make solutions built around data warehouses hard to scale. We begin by presenting a business case that prompted us to explore building a distributed analytics platform that leverages the MapReduce framework pioneered by Google. We present the results of this study and highlight issues with current structured data access techniques for MapReduce platforms. Finally, we present a distributed and scalable data platform that leverages Apache Hadoop to enable business analysts to seamlessly query archived data along with data stored in the warehouse.