Location, location, location!: modeling data proximity in the cloud

Authors:
Birjodh Tiwana;Mahesh Balakrishnan;Marcos K. Aguilera;Hitesh Ballani;Z. Morley Mao
Affiliations:
University of Michigan, Ann Arbor, MI;Microsoft Research, Mountain View, CA;Microsoft Research, Mountain View, CA;Microsoft Research, Cambridge, UK;University of Michigan, Ann Arbor, MI
Venue:
Hotnets-IX Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks
Year:
2010

Abstract

Cloud applications have increasingly come to rely on distributed storage systems that hide the complexity of handling network and node failures behind simple, data-centric interfaces (such as PUTs and GETs on key-value pairs). While these interfaces are very easy to use, the application is completely oblivious to the location of its data in the network; as a result, it has no way to optimize the placement of data or computation. In this paper, we propose exposing the network location of data to applications. The primary challenge is that data does not usually exist at a single point in the network; it can be striped, replicated, cached and coded across different locations, in arbitrary ways that vary across storage systems. For example, an item that is synchronously mirrored in both Seattle and London will appear equally far from both locations for writes, but equally close to both locations for reads. Accordingly, we describe Contour, a system that allows applications to query and manipulate the location of data without requiring them to be aware of the physical machines storing the data, the replication protocols used or the underlying network topology.
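
The Seattle/London example in the abstract can be made concrete with a small sketch. It is not Contour's interface or model, which the abstract does not specify. It is a minimal illustration that assumes a read is served by the nearest replica and a synchronous write waits for the farthest one. The latency values, the `RTT_MS` table, and the function names `read_distance` and `write_distance` are all hypothetical.

```python
# Hypothetical round-trip times in milliseconds between sites.
# Illustrative values only, not measurements from the paper.
RTT_MS = {
    ("seattle", "seattle"): 1,
    ("seattle", "london"): 140,
    ("london", "seattle"): 140,
    ("london", "london"): 1,
}


def read_distance(client, replicas):
    """Assumed read model: any replica can serve the read, so the nearest one counts."""
    return min(RTT_MS[(client, replica)] for replica in replicas)


def write_distance(client, replicas):
    """Assumed write model: a synchronous write waits for every replica,
    so the farthest one determines the cost."""
    return max(RTT_MS[(client, replica)] for replica in replicas)


# An item synchronously mirrored in both cities.
replicas = ["seattle", "london"]

for client in ("seattle", "london"):
    print(client, "read:", read_distance(client, replicas),
          "write:", write_distance(client, replicas))
```

Under these assumptions both clients see the same small read distance and the same large write distance, which is the asymmetry the abstract describes. A storage system that stripes, caches, or erasure-codes data would need a different combining rule than `min` and `max`.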