Current simulation technologies support at most hundreds of thousands of nodes and fall short of the needs of emerging large-scale networked systems, which usually involve millions of nodes. We meet this challenge with a distributed simulation engine that runs millions of node instances on commodity PC clusters and has been tested with a production P2P protocol. This simulation engine is part of the WiDS toolkit, which takes a holistic approach to the research and development of distributed systems. We also propose a critical optimization, called Slow Message Relaxation (SMR), that trades simulation accuracy for performance. By taking advantage of the fact that distributed protocols are resilient to network fluctuation, SMR executes events in a logical time window much wider than the conventional lookahead scheme allows. We analyze and bound the potential effect of the resulting distortion on application logic and other general metrics. Our experiments demonstrate that the simulation engine achieves an order-of-magnitude speedup while producing statistically accurate simulation results.
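The trade-off behind SMR can be illustrated with a small sketch. It is not the WiDS implementation: the function names, the fixed-width window model, and the rule of delivering a late ("slow") message at the receiver's current clock are all assumptions made for illustration. Under those assumptions, a wider window means fewer cluster-wide synchronization rounds, and a message whose timestamp is already in the receiver's past is treated as if the network had delayed it, rather than forcing a rollback.

```python
import math


def barrier_count(sim_duration_ms: float, window_ms: float) -> int:
    """Synchronization rounds needed to cover the simulated duration.

    Hypothetical model: all simulation hosts synchronize once per
    window, so the round count is the duration divided by the window
    width, rounded up.
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return math.ceil(sim_duration_ms / window_ms)


def relaxed_delivery_time(msg_timestamp_ms: float, receiver_clock_ms: float) -> float:
    """Assumed relaxation rule for a slow message.

    If the message's timestamp is already in the receiver's past, it is
    delivered at the receiver's current clock, as though the network
    had added extra delay. Messages that are on time are unaffected.
    """
    return max(msg_timestamp_ms, receiver_clock_ms)


def timestamp_distortion(msg_timestamp_ms: float, receiver_clock_ms: float) -> float:
    """Extra delay the relaxation rule adds to one message (0 if on time)."""
    return relaxed_delivery_time(msg_timestamp_ms, receiver_clock_ms) - msg_timestamp_ms


if __name__ == "__main__":
    duration = 1000.0          # simulated milliseconds (illustrative value)
    lookahead_window = 10.0    # conventional lookahead width (illustrative value)
    relaxed_window = 200.0     # wider relaxed window (illustrative value)

    print("rounds with lookahead window:", barrier_count(duration, lookahead_window))
    print("rounds with relaxed window:  ", barrier_count(duration, relaxed_window))

    # A message stamped 105 ms reaches a receiver whose clock is already at 150 ms.
    print("delivered at:", relaxed_delivery_time(105.0, 150.0))
    print("distortion:  ", timestamp_distortion(105.0, 150.0))
```

In this toy model the distortion of any one message is less than the window width, because both the message timestamp and the receiver clock lie within the same window. That is the kind of quantity the abstract says is analyzed and bounded, although the paper's actual bound is not reproduced here.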