We have collected several large-scale datasets in a number of passive measurement projects on an Internet backbone link belonging to a national university network. The datasets have been used in a range of studies: general classification and characterization of Internet traffic properties, network security projects that detect and classify malicious traffic and hosts, and studies of the network-level properties of unsolicited e-mail (spam) traffic. The Antispam dataset alone contains traffic between more than 10 million e-mail addresses. In this paper we describe our datasets and our data collection methodology, including our experiences in collecting and processing data on a large scale. In particular, we use a dataset from an anti-spam project, which contains complete e-mail traffic, to show how highly privacy-sensitive data can be analyzed in practice. We show not only that it is possible to collect large datasets, but also how to address issues of user privacy, and we share our experiences of working with large datasets.