Bayesian classification (AutoClass): theory and results
Advances in knowledge discovery and data mining
Measuring ISP topologies with rocketfuel
Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications
BGP routing stability of popular destinations
Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurment
BLINC: multilevel traffic classification in the dark
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
Automated Traffic Classification and Application Identification using Machine Learning
LCN '05 Proceedings of the IEEE Conference on Local Computer Networks 30th Anniversary
PRIMED: community-of-interest-based DDoS mitigation
Proceedings of the 2006 SIGCOMM workshop on Large-scale attack defense
Traffic classification using clustering algorithms
Proceedings of the 2006 SIGCOMM workshop on Mining network data
Is sampled data sufficient for anomaly detection?
Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
Measurement and analysis of online social networks
Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
iPlane: an information plane for distributed services
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Unconstrained endpoint profiling (googling the internet)
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
Internet traffic classification demystified: myths, caveats, and the best practices
CoNEXT '08 Proceedings of the 2008 ACM CoNEXT Conference
PAM'07 Proceedings of the 8th international conference on Passive and active network measurement
Analysis of peer-to-peer traffic on ADSL
PAM'05 Proceedings of the 6th international conference on Passive and Active Network Measurement
A hybrid approach for personalized recommendation of news on the Web
Expert Systems with Applications: An International Journal
Pinning Synchronization for a General Complex Networks with Multiple Time-Varying Coupling Delays
Neural Processing Letters
Mosaic: quantifying privacy leakage in mobile networks
Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
ACM SIGCOMM Computer Communication Review
Understanding Internet access trends at a global scale, i.e., how people use the Internet, is a challenging problem that is typically addressed by analyzing network traces. However, obtaining such traces presents its own challenges, owing either to privacy concerns or to other operational difficulties. The key hypothesis of our work is that most of the information needed to profile Internet endpoints is already available around us, on the Web. In this paper, we introduce a novel approach for profiling and classifying endpoints. We implement and deploy a Google-based profiling tool that accurately characterizes endpoint behavior by collecting and strategically combining information freely available on the Web. Our Web-based "unconstrained endpoint profiling" (UEP) approach shows advances in the following scenarios: 1) even when no packet traces are available, it can accurately infer application and protocol usage trends in arbitrary networks; 2) when network traces are available, it outperforms state-of-the-art classification tools such as BLINC; 3) when sampled flow-level traces are available, it retains high classification capabilities. We also explore complementary UEP approaches, such as P2P- and reverse-DNS-lookup-based schemes, and show that they can further improve the results of the Web-based UEP. Using this approach, we perform unconstrained endpoint profiling at a global scale, for clients in four world regions (Asia, South America, North America, and Europe). We provide a first-of-its-kind endpoint analysis that reveals fascinating similarities and differences among these regions.
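The idea of characterizing an endpoint from Web text that mentions it can be illustrated with a minimal sketch. This is not the tool described above: the keyword table, the class names, and the function `profile_endpoint` are all hypothetical, and the snippets are assumed to have been gathered already by some search step that is not shown.

```python
from collections import Counter

# Hypothetical keyword-to-class table. These classes and keywords are
# illustrative assumptions, not the rules used by the tool in the abstract.
KEYWORD_CLASSES = {
    "mail server": "mail",
    "smtp": "mail",
    "torrent": "p2p",
    "emule": "p2p",
    "game server": "gaming",
    "counter-strike": "gaming",
    "proxy list": "proxy",
    "blacklist": "malicious",
}


def profile_endpoint(snippets):
    """Return class labels for one endpoint, ranked by matching snippets.

    `snippets` is a list of text fragments from pages that mention the
    endpoint's IP address (how they are collected is left to the caller).
    """
    votes = Counter()
    for text in snippets:
        lowered = text.lower()
        # Each class gets at most one vote per snippet.
        matched = {cls for kw, cls in KEYWORD_CLASSES.items() if kw in lowered}
        votes.update(matched)
    # Highest vote count first; ties broken alphabetically for a stable order.
    return sorted(votes, key=lambda cls: (-votes[cls], cls))


if __name__ == "__main__":
    # 192.0.2.7 is an address from the documentation-only range.
    example = [
        "Public torrent tracker peers: 192.0.2.7:6881",
        "eMule server list 192.0.2.7",
        "SMTP relay 192.0.2.7 found on blacklist",
    ]
    print(profile_endpoint(example))
```

Combining votes across many independent pages, rather than trusting a single hit, is one simple way to read "strategically combining information"; a real system would need far richer rules and a way to handle stale or conflicting pages.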