We address the problem of collecting unique items in a large stream of information in the context of Intrusion Prevention Systems (IPSs). IPSs detect attacks at gigabit speeds and must log infected source IP addresses for remediation or forensics. An attack with millions of infected sources can result in hundreds of millions of log records when counting duplicates. If logging speeds are much slower than packet arrival rates and memory in the IPS is limited, scalable logging is a technical challenge. After showing that naïve approaches will not suffice, we solve the problem with a new algorithm we call Carousel. Carousel randomly partitions the set of sources into groups that can be logged without duplicates, and then cycles through the set of possible groups. We prove that Carousel collects almost all infected sources with high probability in close to optimal time as long as infected sources keep transmitting. We describe details of a Snort implementation and a hardware design. Simulations with worm propagation models show up to a factor of 10 improvement in collection times for practical scenarios. Our technique applies to any logging problem with non-cooperative sources as long as the information to be logged appears repeatedly.
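
The abstract describes Carousel only at a high level. Below is a minimal sketch, in Python, of the partition-and-cycle idea as stated above: hash each source, log only sources whose hash falls in the current partition, deduplicate within a phase using bounded memory, and cycle partitions on a timer. The class and parameter names (CarouselLogger, memory_limit, phase_duration), the use of a plain set in place of the paper's Bloom filter, and the rule for growing the number of partitions are illustrative assumptions, not the authors' Snort or hardware implementation.

```python
# Illustrative sketch of the Carousel partition-and-cycle idea (not the paper's code).
import hashlib
import time


class CarouselLogger:
    def __init__(self, memory_limit=1000, phase_duration=1.0):
        self.memory_limit = memory_limit      # assumed bound on distinct sources remembered per phase
        self.phase_duration = phase_duration  # assumed seconds spent on each partition
        self.k = 0                            # number of hash bits used to partition the source space
        self.color = 0                        # partition currently being logged
        self.seen = set()                     # stand-in for the per-phase Bloom filter
        self.phase_start = time.monotonic()
        self.logged = []                      # stand-in for the slow logging channel

    def _partition(self, source):
        # Hash the source and keep the low k bits; a source is eligible for
        # logging only in the phase whose color matches this value.
        h = int.from_bytes(hashlib.sha1(source.encode()).digest()[:4], "big")
        return h & ((1 << self.k) - 1)

    def _advance_phase(self):
        self.color = (self.color + 1) % (1 << self.k)
        self.seen.clear()
        self.phase_start = time.monotonic()

    def observe(self, source):
        """Called for every packet; returns True if the source was logged now."""
        if time.monotonic() - self.phase_start >= self.phase_duration:
            self._advance_phase()
        if self._partition(source) != self.color:
            return False                      # not this partition's turn
        if source in self.seen:
            return False                      # duplicate within the current phase
        if len(self.seen) >= self.memory_limit:
            # Too many distinct sources for memory: split into finer partitions.
            self.k += 1
            self._advance_phase()
            return False
        self.seen.add(source)
        self.logged.append(source)            # hand off to the slow logger
        return True
```

The point of the sketch is the design choice the abstract asserts: by splitting sources into enough groups that each group fits in memory, duplicates within a phase are suppressed cheaply, and cycling through all groups eventually logs nearly every source, provided infected sources keep transmitting so they reappear when their partition's turn comes.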