Blogs are arguably the most popular genre of user-generated content (UGC), which makes them a gold mine for social science research. However, existing research on blogs has suffered from nonprobability samples collected either manually or by computerized crawling based on the random walks method. The current article presents a probability sampling method for blogs, called random digit search (RDS), that is adapted from the popular "random digit dialing" (RDD) method used in telephone surveys. The RDS method was tested in a study of Sina Blog, a popular blog service provider (BSP) in China. The results show that, while "random walks" sampling tends to oversample popular/active blogs, probability samples generated by RDS yield consistent and precise estimates of population parameters. Although RDS takes advantage of the numeric identification (ID) system used on Sina Blog, the general principles may be applicable to other BSPs and many other genres of UGC.
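The abstract does not spell out the RDS procedure, so the following is only a minimal sketch of the general idea, assuming that RDS works like RDD: candidate numeric IDs are drawn uniformly at random from the ID range, and only those that resolve to an existing blog are kept. The function name `random_digit_search`, the `id_exists` lookup, the ID range, and the toy population are all hypothetical stand-ins, not the authors' implementation or any Sina Blog interface.

```python
import random
from typing import Callable, List, Set, Tuple


def random_digit_search(
    id_exists: Callable[[int], bool],
    id_min: int,
    id_max: int,
    sample_size: int,
    seed: int = 0,
    max_draws: int = 1_000_000,
) -> Tuple[List[int], int]:
    """Draw candidate IDs uniformly at random and keep those that exist.

    `id_exists` is a hypothetical lookup supplied by the caller (in practice
    it might request the blog page for that ID). Returns the sampled IDs and
    the number of distinct candidates tried.
    """
    rng = random.Random(seed)
    tried: Set[int] = set()
    sample: List[int] = []
    space = id_max - id_min + 1
    # Stop when the sample is full, the ID space is exhausted, or the cap is hit.
    while len(sample) < sample_size and len(tried) < min(space, max_draws):
        candidate = rng.randint(id_min, id_max)  # inclusive on both ends
        if candidate in tried:
            continue  # skip repeats so no blog is sampled twice
        tried.add(candidate)
        if id_exists(candidate):
            sample.append(candidate)
    return sample, len(tried)


# Toy population standing in for a real blog service: only even IDs exist.
toy_population = {i for i in range(1000, 2000) if i % 2 == 0}
sample, draws = random_digit_search(lambda i: i in toy_population, 1000, 1999, 50)
```

Because every ID in the range has the same chance of being drawn, each existing blog enters the sample with equal probability regardless of how many links point to it, which is the property a random walk crawl lacks. The ratio of `len(sample)` to `draws` also gives a rough estimate of the share of the ID space that is occupied.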