A scalable distributed information management system

Authors:
Praveen Yalagandula; Mike Dahlin
Affiliations:
The University of Texas at Austin, Austin, TX; The University of Texas at Austin, Austin, TX
Venue:
Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
Year:
2004

Citing 30
Cited 80

Accessing nearby copies of replicated objects in a distributed environment

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Flexible update propagation for weakly consistent replication

Proceedings of the sixteenth ACM symposium on Operating systems principles
Resource containers: a new facility for resource management in server systems

OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
The network weather service: a distributed resource performance forecasting service for metacomputing

Future Generation Computer Systems - Special issue on metacomputing
Directed diffusion: a scalable and robust communication paradigm for sensor networks

MobiCom '00 Proceedings of the 6th annual international conference on Mobile computing and networking
Space/time trade-offs in hash coding with allowable errors

Communications of the ACM
Bayeux: an architecture for scalable and fault-tolerant wide-area data dissemination

NOSSDAV '01 Proceedings of the 11th international workshop on Network and operating systems support for digital audio and video
Chord: A scalable peer-to-peer lookup service for internet applications

Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
A scalable content-addressable network

Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
Bandwidth constrained placement in a WAN

Proceedings of the twentieth annual ACM symposium on Principles of distributed computing
Offering a Precision-Performance Tradeoff for Aggregation Queries over Replicated Data

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Kademlia: A Peer-to-Peer Information System Based on the XOR Metric

IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Serving DNS Using a Peer-to-Peer Lookup Service

IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Routing Algorithms for DHTs: Some Open Questions

IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems

Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
Application-Level Multicast Using Content-Addressable Networks

NGC '01 Proceedings of the Third International COST264 Workshop on Networked Group Communication
Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining

ACM Transactions on Computer Systems (TOCS)
The impact of DHT routing geometry on resilience and proximity

Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications
Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and

Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and
SHARP: an architecture for secure resource peering

SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
SplitStream: high-bandwidth multicast in cooperative environments

SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Bitmap algorithms for counting active flows on high speed links

Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement
TAG: a Tiny AGgregation service for Ad-Hoc sensor networks

OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementation
InfoSpect: using a logic language for system health monitoring in distributed systems

EW 10 Proceedings of the 10th workshop on ACM SIGOPS European workshop
Processes in KaffeOS: isolation, resource management, and sharing in java

OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
SkipNet: a scalable overlay network with practical locality properties

USITS'03 Proceedings of the 4th conference on USENIX Symposium on Internet Technologies and Systems - Volume 4
Querying the internet with PIER

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
The surprising power of epidemic communication

Future directions in distributed computing
The potential costs and benefits of long-term prefetching for content distribution

Computer Communications
Scribe: a large-scale and decentralized application-level multicast infrastructure

IEEE Journal on Selected Areas in Communications

A case study in building layered DHT applications

Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
Gossip-based aggregation in large dynamic networks

ACM Transactions on Computer Systems (TOCS)
Location based placement of whole distributed systems

CoNEXT '05 Proceedings of the 2005 ACM conference on Emerging network experiment and technology
INSIGHT: a distributed monitoring system for tracking continuous queries

Proceedings of the twentieth ACM symposium on Operating systems principles
A need for componentized transport protocols

Proceedings of the twentieth ACM symposium on Operating systems principles
S3: a scalable sensing service for monitoring large networked systems

Proceedings of the 2006 SIGCOMM workshop on Internet network management
Delay aware querying with seaweed

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Experiences in building and operating ePOST, a reliable peer-to-peer application

Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Survey of research towards robust peer-to-peer networks: search methods

Computer Networks: The International Journal of Computer and Telecommunications Networking
Availability of multi-object operations

NSDI'06 Proceedings of the 3rd conference on Networked Systems Design & Implementation - Volume 3
Optimal proactive caching in peer-to-peer network: analysis and application

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
STAR: self-tuning aggregation for scalable monitoring

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Supporting self-organization for hybrid grid resource scheduling

Proceedings of the 2008 ACM symposium on Applied computing
San Fermín: aggregating large data sets using a binomial swap forest

NSDI'08 Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation
Towards a model of computer systems research

WOWCS'08 Proceedings of the conference on Organizing Workshops, Conferences, and Symposia for Computer Systems
Wide-scale data stream management

ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
Robust aggregation in peer-to-peer database systems

IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
Distributed hash sketches: Scalable, efficient, and accurate cardinality estimation for distributed multisets

ACM Transactions on Computer Systems (TOCS)
Moara: flexible and scalable group-based querying system

Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware
Resource Discovery Techniques in Distributed Desktop Grid Environments

GRID '06 Proceedings of the 7th IEEE/ACM International Conference on Grid Computing
PARDA: proportional allocation of resources for distributed storage access

FAST '09 Proceedings of the 7th conference on File and storage technologies
Efficient on-demand operations in dynamic distributed infrastructures

LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
Conference reviewing considered harmful

ACM SIGOPS Operating Systems Review
A Partition-Based Broadcast Algorithm over DHT for Large-Scale Computing Infrastructures

GPC '09 Proceedings of the 4th International Conference on Advances in Grid and Pervasive Computing
Self-correlating predictive information tracking for large-scale production systems

ICAC '09 Proceedings of the 6th international conference on Autonomic computing
On the treeness of internet latency and bandwidth

Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Toward a cloud computing research agenda

ACM SIGACT News
AVCOL: Availability-aware information aggregation in large distributed systems under uncollaborative behavior

Computer Networks: The International Journal of Computer and Telecommunications Networking
Towards an architecture for service deployment in contributory communities

International Journal of Grid and Utility Computing
Dynamic Query Processing for P2P Data Services in the Cloud

DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications
A peer-to-peer IO buffering service based on RAM-grid

International Journal of Autonomous and Adaptive Communications Systems
DHT-based lightweight broadcast algorithms in large-scale computing infrastructures

Future Generation Computer Systems
Statistical structures for Internet-scale data management

The VLDB Journal — The International Journal on Very Large Data Bases
Consistency of States of Management Data in P2P-Based Autonomic Network Management

DSOM '09 Proceedings of the 20th IFIP/IEEE International Workshop on Distributed Systems: Operations and Management: Integrated Management of Systems, Services, Processes and People in IT
Elaborating a decentralized market information system

OTM'07 Proceedings of the 2007 OTM confederated international conference on On the move to meaningful internet systems - Volume Part I
Load-balanced query dissemination in privacy-aware online communities

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Monalytics: online monitoring and analytics for managing large scale data centers

Proceedings of the 7th international conference on Autonomic computing
Enabling routing control in a DHT

IEEE Journal on Selected Areas in Communications
Peer-to-peer systems

Communications of the ACM
Scaling a monitoring infrastructure for the Akamai network

ACM SIGOPS Operating Systems Review
Gossip-based distribution estimation in peer-to-peer networks

IPTPS'08 Proceedings of the 7th international conference on Peer-to-peer systems
Network imprecision: a new consistency metric for scalable monitoring

OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
A self-organization mechanism based on cross-entropy method for P2P-like applications

ACM Transactions on Autonomous and Adaptive Systems (TAAS)
Resource adaptive distributed information sharing

EUNICE'10 Proceedings of the 16th EUNICE/IFIP WG 6.6 conference on Networked services and applications: engineering, control and management
Keeping track of 70,000+ servers: the akamai query system

LISA'10 Proceedings of the 24th international conference on Large installation system administration
An efficient management and automatic failover on a large-scale cluster monitoring system

ICOSSSE '09 Proceedings of the 8th WSEAS international conference on System science and simulation in engineering
A hybrid approach for estimating document frequencies in unstructured P2P networks

Information Systems
SAAR: a shared control plane for overlay multicast

NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Friday: global comprehension for distributed replay

NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Peer-to-peer web search: euphoria, achievements, disillusionment, and future opportunities

From active data management to event-based systems and more
OLIC: online information compression for scalable hosting infrastructure monitoring

Proceedings of the Nineteenth International Workshop on Quality of Service
A flexible architecture integrating monitoring and analytics for managing large-scale data centers

Proceedings of the 8th ACM international conference on Autonomic computing
In-situ MapReduce for log processing

USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Anonygator: privacy and integrity preserving data aggregation

Proceedings of the ACM/IFIP/USENIX 11th International Conference on Middleware
A distributed full-text top-k document dissemination system in distributed hash tables

World Wide Web
Network-aware summarisation for resource discovery in P2P-content networks

Future Generation Computer Systems
STAIRS: Towards efficient full-text filtering and dissemination in DHT environments

The VLDB Journal — The International Journal on Very Large Data Bases
Distributed network querying with bounded approximate caching

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Database-centric programming for wide-area sensor systems

DCOSS'05 Proceedings of the First IEEE international conference on Distributed Computing in Sensor Systems
Design of adaptive overlays for multi-scale communication in sensor networks

DCOSS'05 Proceedings of the First IEEE international conference on Distributed Computing in Sensor Systems
NetProfiler: profiling wide-area networks using peer cooperation

IPTPS'05 Proceedings of the 4th international conference on Peer-to-Peer Systems
Willow: DHT, aggregation, and publish/subscribe in one protocol

IPTPS'04 Proceedings of the Third international conference on Peer-to-Peer Systems
Cloud-scale resource management: challenges and techniques

HotCloud'11 Proceedings of the 3rd USENIX conference on Hot topics in cloud computing
In-situ MapReduce for log processing

HotCloud'11 Proceedings of the 3rd USENIX conference on Hot topics in cloud computing
Secure Distributed Data Aggregation

Foundations and Trends in Databases
Processing flows of information: From data stream to complex event processing

ACM Computing Surveys (CSUR)
Benchmarking decentralized monitoring mechanisms in peer-to-peer systems

ICPE '12 Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering
Camdoop: exploiting in-network aggregation for big data applications

NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Self-adaptive approximate queries for large-scale information aggregation

International Journal of Web and Grid Services
The XtreemOS Resource Selection Service

ACM Transactions on Autonomous and Adaptive Systems (TAAS) - Special Section: Extended Version of SASO 2011 Best Paper
ASIA: application-specific integrated aggregation for publish/subscribe middleware

Proceedings of the Posters and Demo Track
A decentralized approach for mining event correlations in distributed system monitoring

Journal of Parallel and Distributed Computing
VScope: middleware for troubleshooting time-sensitive data center applications

Proceedings of the 13th International Middleware Conference
Aggregation for implicit invocations

Proceedings of the 12th annual international conference on Aspect-oriented software development
A task routing approach to large-scale scheduling

Future Generation Computer Systems
Parametric Content-Based Publish/Subscribe

ACM Transactions on Computer Systems (TOCS)
MatchTree: Flexible, scalable, and fault-tolerant wide-area resource discovery with distributed matchmaking and aggregation

Future Generation Computer Systems
Autonomic cloud resource sharing for intercloud federations

Future Generation Computer Systems
Performance troubleshooting in data centers: an annotated bibliography?

ACM SIGOPS Operating Systems Review
Decentralized monitoring in peer-to-peer systems

Benchmarking Peer-to-Peer Systems

Quantified Score

Hi-index	0.02

Abstract

We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary views of global information. To serve as a basic building block, an SDIMS should have four properties: scalability to many nodes and attributes, flexibility to accommodate a broad range of applications, administrative isolation for security and availability, and robustness to node and network failures. We design, implement, and evaluate an SDIMS that (1) leverages Distributed Hash Tables (DHTs) to create scalable aggregation trees, (2) provides flexibility through a simple API that lets applications control propagation of reads and writes, (3) provides administrative isolation through simple extensions to current DHT algorithms, and (4) achieves robustness to node and network reconfigurations through lazy reaggregation, on-demand reaggregation, and tunable spatial replication. Through extensive simulations and micro-benchmark experiments, we observe that our system is an order of magnitude more scalable than existing approaches, achieves isolation properties at the cost of modestly increased read latency in comparison to flat DHTs, and gracefully handles failures.
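
The abstract gives no implementation detail, so the following is only an illustrative sketch of how DHT routing paths can form an aggregation tree; it is not the authors' code. It assumes fixed-length bit-string node IDs, a fully populated ID space, and a simplified prefix-correcting routing rule. The names `parent` and `aggregate` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


def parent(node: str, key: str) -> Optional[str]:
    """Next hop from `node` toward `key`: fix the first bit that differs.

    Assumption: every ID in the space is a live node, so this hop exists.
    Returns None when `node` is itself the root for `key`.
    """
    for i, (a, b) in enumerate(zip(node, key)):
        if a != b:
            return node[:i] + b + node[i + 1:]
    return None


def aggregate(local_values: Dict[str, int], key: str,
              combine: Callable[[List[int]], int] = sum) -> int:
    """Combine every node's local value along the routing tree rooted at `key`."""
    # Routes from all nodes toward `key` never cycle, so they form a tree.
    children: Dict[str, List[str]] = defaultdict(list)
    for node in local_values:
        p = parent(node, key)
        if p is not None:
            children[p].append(node)

    def up(node: str) -> int:
        # A node's aggregate is its own value combined with its children's.
        return combine([local_values[node]] + [up(c) for c in children[node]])

    return up(key)


if __name__ == "__main__":
    values = {format(i, "03b"): i for i in range(8)}  # 8 nodes, 3-bit IDs
    print(aggregate(values, "101"))               # sum over the tree rooted at 101
    print(aggregate(values, "010", combine=max))  # a different key, a different tree
```

Because the tree's root depends on the key, different keys yield different trees over the same set of nodes. This is one plausible way an aggregation workload could be spread across a DHT.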