Increasing knowledge of paedophile activity in P2P systems is a crucial societal concern, with important consequences for child protection, policy making, and internet regulation. However, because traces of P2P exchanges and a rigorous analysis methodology are lacking, current knowledge of this activity remains very limited. We consider here a widely used P2P system, eDonkey, and focus on two key statistics: the fraction of paedophile queries entered in the system and the fraction of users who entered such queries. We first collect hundreds of millions of keyword-based queries. We then design a paedophile query detection tool and establish its false positive and false negative rates through assessment by experts. With this tool and these rates, we estimate the fraction of paedophile queries in our data. Finally, we design and apply methods for quantifying the users who entered such queries. We conclude that approximately 0.25% of queries are paedophile, and that more than 0.2% of users enter such queries. These statistics are by far the most precise and reliable ever obtained in this domain.
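To illustrate how false positive and false negative rates can turn a raw detection count into an estimate of the true fraction, here is a minimal sketch. It uses the standard Rogan-Gladen correction, which is an assumption made for illustration and not necessarily the estimator used in the paper. The function name, the parameter names, and all numeric inputs are hypothetical and are not taken from the study's data.

```python
def corrected_fraction(flagged: int, total: int, fpr: float, fnr: float) -> float:
    """Estimate the true positive fraction from a detector's raw count.

    Illustrative Rogan-Gladen correction (an assumption, not the paper's method).
    fpr: probability that a negative item is flagged.
    fnr: probability that a positive item is not flagged.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= flagged <= total:
        raise ValueError("flagged must lie between 0 and total")
    denominator = 1.0 - fpr - fnr
    if denominator <= 0:
        raise ValueError("detector is no better than chance (fpr + fnr >= 1)")

    # The observed rate mixes true detections and false alarms:
    #   observed = p * (1 - fnr) + (1 - p) * fpr
    # Solving for p gives the corrected estimate.
    observed = flagged / total
    estimate = (observed - fpr) / denominator

    # Sampling noise can push the estimate slightly outside [0, 1].
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formula.
    estimate = corrected_fraction(flagged=29_975, total=10_000_000, fpr=0.001, fnr=0.2)
    print(f"estimated fraction: {estimate:.4%}")
```

The correction matters most when the true fraction is small: at rates well below 1%, even a low false positive rate can account for a large share of the flagged queries, so the raw flagged fraction alone would overstate the result.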