One challenge for content providers on the Web is determining who consumes their content. For instance, online newspapers want to know who is reading their articles. Previous approaches have tried to determine such audience demographics by placing cookies on users' systems, or by directly asking consumers (e.g., through surveys). The first approach may make users uncomfortable, and the second is not scalable. In this paper, we focus on determining the demographics of a Website's audience by analyzing the blogs that link to the Website. We analyze both the text of the blogs and the connectivity of the blog network to determine demographics such as whether a person "is married" or "has pets." Presumably, bloggers who link to a site also consume its content. Therefore, the demographics discovered for the bloggers can serve as a proxy for the demographics of a subset of the Website's consumers. We demonstrate that in many cases we can infer sub-audiences for a site from these demographics. Further, this feasibility result suggests that very specific demographics for sites (e.g., people who play video games) can be generated as the methods for determining them improve. Our study analyzes blogs from more than 590,000 bloggers, collected over a six-month period, that link to more than 488,000 distinct external websites.
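The proxy idea in the abstract (demographics inferred for bloggers stand in for the demographics of the sites they link to) can be illustrated with a minimal sketch. This is not the paper's method: the function name `site_demographics`, the input dictionaries, and the simple "fraction of linking bloggers" score are all assumptions made for illustration, and the per-blogger labels are taken as given rather than inferred from blog text or network structure.

```python
from collections import Counter, defaultdict


def site_demographics(blogger_labels, blogger_links):
    """Aggregate per-blogger demographic labels into a per-site profile.

    blogger_labels: dict mapping blogger -> set of labels (hypothetical input,
                    e.g. {"is married", "has pets"}), assumed already inferred.
    blogger_links:  dict mapping blogger -> set of external sites linked to.
    Returns: dict mapping site -> {label: fraction of linking bloggers with it}.
    """
    label_counts = defaultdict(Counter)  # site -> label -> number of bloggers
    linkers = Counter()                  # site -> number of linking bloggers
    for blogger, sites in blogger_links.items():
        labels = blogger_labels.get(blogger, set())
        for site in sites:
            linkers[site] += 1
            label_counts[site].update(labels)
    return {
        site: {label: n / linkers[site] for label, n in label_counts[site].items()}
        for site in linkers
    }


# Toy data; the names and sites are invented for the example.
labels = {
    "alice": {"is married", "has pets"},
    "bob": {"has pets"},
    "carol": set(),
}
links = {
    "alice": {"news.example"},
    "bob": {"news.example", "games.example"},
    "carol": {"games.example"},
}
profile = site_demographics(labels, links)
```

In this toy data, both bloggers linking to `news.example` have pets and one of the two is married, so the profile for that site reflects those proportions. A real pipeline would also need to handle label uncertainty and the fact that bloggers are only a subset of a site's audience.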