Inferring Demographic Attributes of Anonymous Internet Users

  • Authors:
  • Dan Murray; Kevan Durrell

  • Venue:
  • WEBKDD '99 Revised Papers from the International Workshop on Web Usage Analysis and User Profiling
  • Year:
  • 1999

Abstract

Today it is quite common for web page content to include an advertisement. Because advertisers often want to target their message to people with certain demographic attributes, the anonymity of Internet users poses a special problem for them. The purpose of the present research is to find an effective way to infer demographic information (e.g., gender, age, or income) about people who use the Internet but for whom demographic information is not otherwise available. Our hope is to build a high-quality database of demographic profiles covering a large segment of the Internet population without having to survey each individual Internet user. Though Internet users are largely anonymous, they nonetheless provide a certain amount of usage information. Usage information includes, but is not limited to, (a) search terms entered by the Internet user and (b) web pages accessed by the Internet user. In this paper, we describe an application of the Latent Semantic Analysis (LSA) [1] information retrieval technique to construct a vector space in which we can represent the usage data associated with each Internet user of interest. We then show how the LSA vector space enables us to produce demographic inferences: each user's LSA vector supplies the input to a three-layer neural model trained using the scaled conjugate gradient (SCG) method.
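
The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' setup: the toy usage strings, the binary labels, the two LSA dimensions, the hidden-layer size, and the helper names (`term_user_matrix`, `lsa_basis`, `project`, `train_mlp`) are all hypothetical. In particular, the network is trained with plain gradient descent, swapped in for the scaled conjugate gradient method named in the abstract, to keep the example short and dependent only on NumPy.

```python
import numpy as np

# Hypothetical toy usage data: search terms / page names, one string per user.
usage = [
    "football scores nfl draft trade",
    "nfl football highlights",
    "makeup tutorial fashion",
    "fashion trends makeup reviews",
]
# Hypothetical binary attribute, known only for these "surveyed" users.
labels = np.array([0.0, 0.0, 1.0, 1.0])


def term_user_matrix(docs, vocab=None):
    """Build a term-by-user count matrix; terms outside vocab are ignored."""
    if vocab is None:
        vocab = sorted({t for d in docs for t in d.split()})
    index = {t: i for i, t in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in d.split():
            if t in index:
                counts[index[t], j] += 1.0
    return counts, vocab


def lsa_basis(counts, k):
    """Return the top-k left singular vectors (the LSA term-space basis)."""
    u, _, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k]


def project(counts, basis):
    """Map each user's term-count column into the k-dimensional LSA space."""
    return counts.T @ basis


def train_mlp(x, y, hidden=4, steps=2000, lr=0.5, seed=0):
    """Three-layer net (input, tanh hidden, sigmoid output).

    Trained with plain gradient descent on mean cross-entropy,
    NOT the SCG method used in the paper.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(steps):
        h = np.tanh(x @ w1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
        g = (p - y) / len(y)                     # gradient w.r.t. output logit
        gh = np.outer(g, w2) * (1.0 - h ** 2)    # gradient w.r.t. hidden pre-activation
        w2 -= lr * (h.T @ g)
        b2 -= lr * g.sum()
        w1 -= lr * (x.T @ gh)
        b1 -= lr * gh.sum(axis=0)

    def predict(z):
        hz = np.tanh(z @ w1 + b1)
        return 1.0 / (1.0 + np.exp(-(hz @ w2 + b2)))

    return predict


counts, vocab = term_user_matrix(usage)
basis = lsa_basis(counts, k=2)
user_vecs = project(counts, basis)           # one LSA vector per labeled user
predict = train_mlp(user_vecs, labels)
train_probs = predict(user_vecs)

# Infer the attribute for an anonymous user by folding their usage into the same space.
new_counts, _ = term_user_matrix(["nfl draft football"], vocab)
new_prob = predict(project(new_counts, basis))[0]
```

Projecting with the left singular vectors means that a new, unlabeled user can be placed in the same space without recomputing the decomposition. A realistic system would need far more users, term weighting, and held-out evaluation, none of which this sketch attempts.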