Topic-Based Audience Metrics for Internet Marketing by Combining Ontologies and Output Page Mining

  • Authors:
  • Jean-Pierre Norguet;Esteban Zimanyi

  • Affiliations:
  • Universite Libre de Bruxelles, Brussels, Belgium;Universite Libre de Bruxelles, Brussels, Belgium

  • Venue:
  • CIMCA '05 Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce Vol-2 (CIMCA-IAWTIC'06) - Volume 02
  • Year:
  • 2005

Abstract

In Internet marketing, Web audience analysis is essential to understanding visitors' needs. However, existing analysis tools fail to deliver the summarized, conceptual metrics that organization managers and Web site editors need. The reason is that the HTTP transaction metadata mined by these tools does not include the text content sent to browsers. In this paper, we first describe the methods that we devised to mine the Web pages output by Web servers: content journaling, script parsing, server monitoring, network monitoring, and client-side mining. Then, for a given ontology, we count the occurrences of ontology entries in the mined pages and compare the results to the term weights in the online pages. By aggregating the metrics in the ontology, we obtain audience metrics which should represent the Web site topics. Finally, we validate our approach with experiments on real data using SQL Server OLAP and our prototype WASA.
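
The counting and aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how WASA matches terms or aggregates counts, so the toy ontology (PARENT), the term-to-topic table (TERMS), the exact-word matching, and the weighting of each page by the number of times it was served are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy ontology: each topic maps to its parent (None marks the root).
PARENT = {
    "site": None,
    "products": "site",
    "support": "site",
    "laptops": "products",
    "phones": "products",
}

# Hypothetical ontology entries: the term that signals each topic.
TERMS = {"laptop": "laptops", "phone": "phones", "warranty": "support"}


def count_terms(text):
    """Count exact-word occurrences of each ontology term in one page's text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in TERMS)


def topic_metrics(pages):
    """Sum term occurrences over served pages and roll them up the hierarchy."""
    totals = Counter()
    for text, times_served in pages:
        for term, n in count_terms(text).items():
            topic = TERMS[term]
            # Credit the matched topic and every ancestor up to the root.
            while topic is not None:
                totals[topic] += n * times_served
                topic = PARENT[topic]
    return totals


# Illustrative data: (page text as sent to browsers, number of times served).
pages = [
    ("Laptop deals: every laptop ships with a warranty", 3),
    ("Phone accessories", 2),
]
metrics = topic_metrics(pages)
```

With this made-up data, the "laptops" topic receives 6 occurrences, which roll up into "products" together with the 2 from "phones"; the root "site" collects every count. A real system would also need stemming or synonym handling, which this sketch leaves out.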