Leveraging the characteristics of the YouTube video ID space and exploiting a unique property of the YouTube search API, we develop a random prefix sampling method to estimate the total number of videos hosted by YouTube. Through theoretical modeling and analysis, we demonstrate that the estimator based on this method is unbiased, and we provide bounds on its variance and confidence interval. These bounds enable us to judiciously select sample sizes to control estimation errors. We evaluate our sampling method and validate the sampling results using two distinct collections of YouTube video IDs, treating each collection as if it were the "true" collection of YouTube videos. We then apply our sampling method to the live YouTube system and estimate that it hosted roughly 500 million videos as of May 2011. Finally, using an unbiased collection of YouTube videos sampled by our method, we show that YouTube video view count statistics collected by prior methods (e.g., through crawling of related video links) are highly skewed, significantly under-estimating the number of videos with very small view counts (
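
The abstract does not spell out the estimator, so the following is only a minimal sketch of one way a random prefix sampling estimate could be computed, under stated assumptions. It assumes that a query for a prefix returns the number of IDs sharing that prefix, and it replaces the live system with a synthetic collection treated as the "true" one. The toy alphabet, the ID and prefix lengths, and all function names are hypothetical and are not taken from the paper. The estimate is the mean count over uniformly sampled prefixes, scaled by the number of possible prefixes.

```python
import random
from collections import Counter

# Toy parameters (assumptions for illustration, not YouTube's real ID format).
ALPHABET = "abcdefgh"
ID_LENGTH = 5
PREFIX_LENGTH = 2


def make_population(n, rng):
    """Draw n distinct random IDs to stand in for the 'true' collection."""
    ids = set()
    while len(ids) < n:
        ids.add("".join(rng.choice(ALPHABET) for _ in range(ID_LENGTH)))
    return ids


def build_prefix_counts(ids, prefix_length):
    """Count IDs per prefix; this plays the role of the prefix query."""
    return Counter(video_id[:prefix_length] for video_id in ids)


def estimate_total(prefix_counts, prefix_length, num_samples, rng):
    """Scale the mean count over uniformly sampled prefixes by the number
    of possible prefixes. Each prefix is equally likely, so the expected
    value of the estimate equals the true total."""
    num_prefixes = len(ALPHABET) ** prefix_length
    observed = 0
    for _ in range(num_samples):
        prefix = "".join(rng.choice(ALPHABET) for _ in range(prefix_length))
        observed += prefix_counts[prefix]  # Counter yields 0 if unseen
    return num_prefixes * observed / num_samples


if __name__ == "__main__":
    rng = random.Random(0)
    population = make_population(5000, rng)
    counts = build_prefix_counts(population, PREFIX_LENGTH)
    print(estimate_total(counts, PREFIX_LENGTH, 200, rng))
```

In this sketch the estimate tightens as the number of sampled prefixes grows, which is the sense in which variance bounds can guide the choice of sample size.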