We investigate sampling techniques for unbalanced heterogeneous bipartite graphs (UHBGs), which arise widely in real-world, web-scale social networks. We propose random walk-based link sampling and stratified sampling for UHBGs and show that they have advantages over generic random walk samplers. In addition, for each sampler we analytically derive an estimator of the node degree distribution parameter, which serves as a quality indicator. In the experiments, we apply the two sampling techniques, together with a baseline node sampling method, to both synthetic and real Facebook data. The experimental results show that the random walk-based stratified sampler has a significant advantage over the node sampler and the link sampler on UHBGs.
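To make the idea of random walk-based link sampling concrete, the following is a minimal sketch and not the samplers proposed in the paper, whose details are not given here. It assumes an undirected bipartite graph supplied as (left, right) pairs and records every edge traversed by a simple random walk; the names build_bipartite and random_walk_link_sample are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def build_bipartite(edges):
    """Build an adjacency map for an undirected bipartite graph.

    `edges` is an iterable of (left, right) pairs. Nodes are tagged with
    "L" or "R" so that identical labels on the two sides do not collide.
    """
    adj = defaultdict(list)
    for left, right in edges:
        adj[("L", left)].append(("R", right))
        adj[("R", right)].append(("L", left))
    return adj


def random_walk_link_sample(adj, start, n_steps, seed=None):
    """Collect the edges traversed by a simple random walk.

    Illustrative assumption: the sample is the raw sequence of traversed
    edges, with repeats kept. No burn-in, thinning, or stratification
    is applied here.
    """
    rng = random.Random(seed)
    current = start
    sampled = []
    for _ in range(n_steps):
        # Move to a uniformly chosen neighbor of the current node.
        nxt = rng.choice(adj[current])
        # Store each edge in (left, right) orientation.
        if current[0] == "L":
            sampled.append((current[1], nxt[1]))
        else:
            sampled.append((nxt[1], current[1]))
        current = nxt
    return sampled


# Toy unbalanced graph: three left nodes, two right nodes.
toy_edges = [(0, "a"), (0, "b"), (1, "a"), (2, "a")]
toy_sample = random_walk_link_sample(
    build_bipartite(toy_edges), start=("L", 0), n_steps=50, seed=1
)
```

In a long simple random walk on a connected undirected graph, nodes are visited in proportion to their degree, while edges are traversed with roughly equal long-run frequency. This is one reason a link-oriented sample can behave differently from a node-oriented one when the two sides of the graph differ greatly in size and degree.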