An architecture for inter-domain network troubleshooting
In this paper, we explore the constraints of a new problem: that of coordinating network troubleshooting among peer administrative domains and untrusted observers. Our approach permits any entity to report problems, whether it is a Network Operations Center (NOC), an end user, or an application. Our goals are to define the inter-domain coordination problem clearly, and to develop an architecture that allows observers to report problems and receive timely feedback, regardless of their own locations and identities. By automating this process, we also relieve human bottlenecks at help desks and NOCs whenever possible. We present a troubleshooting approach for coordinating problem diagnosis, and describe Global Distributed Troubleshooting (GDT), a distributed protocol that realizes this approach. We show through simulation that GDT scales well as the number of observers and problems grows.