Massive structured data management solution

  • Authors: Ullas Nambiar; Rajeev Gupta; Himanshu Gupta; Mukesh Mohania

  • Affiliations: IBM Research India, New Delhi, India (all four authors)

  • Venue: CIKM '10: Proceedings of the 19th ACM International Conference on Information and Knowledge Management
  • Year: 2010

Abstract

The need to analyze structured data for business intelligence applications such as customer churn analysis and social network analysis is well known. However, the volume that such data is expected to reach will make solutions built around data warehouses hard to scale. We begin by presenting a business case that prompted us to explore building a distributed analytics platform that leverages the MapReduce framework pioneered by Google. We then present the results of the study and highlight issues with current structured data access techniques for MapReduce platforms. Finally, we present a distributed and scalable data platform that leverages Apache Hadoop to enable business analysts to seamlessly query archived data along with data stored in the warehouse.