Detecting online commercial intention (OCI)

  • Authors: Honghua (Kathy) Dai; Lingzhi Zhao; Zaiqing Nie; Ji-Rong Wen; Lee Wang; Ying Li
  • Affiliations: Microsoft Corporation, Redmond, WA; Tsinghua University, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; N/A; Microsoft Corporation, Redmond, WA
  • Venue: Proceedings of the 15th International Conference on World Wide Web
  • Year: 2006

Abstract

Understanding the goals and preferences behind a user's online activities can greatly help information providers, such as search engines and e-commerce Web sites, personalize content and thus improve user satisfaction. Understanding a user's intention could also provide other business advantages to information providers. For example, information providers can decide whether to display commercial content based on the user's intent to purchase. Previous work on Web search defines three major types of user search goals for search queries: navigational, informational, and transactional or resource [1][7]. In this paper, we focus on capturing commercial intention from search queries and Web pages, i.e., whether a user who submits a query or browses a Web page is about to commit to, or is in the middle of, a commercial activity such as a purchase, auction, sale, or paid service. We call the commercial intention behind a user's online activities OCI (Online Commercial Intention). We also propose the notion of "Commercial Activity Phase" (CAP), which identifies the phase of a commercial activity that a user is in: Research or Commit. We present a framework for building machine learning models that learn OCI from any Web page content. Based on that framework, we build models to detect OCI from search queries and Web pages. For a given search query, we train machine learning models on two types of data sources: the content of the algorithmic search result page(s) and the contents of the top sites returned by a search engine. Our experiments show that the model based on the first data source achieves better performance. We also find that frequent queries are more likely to have commercial intention. Finally, we outline future work on learning richer commercial intention behind users' online activities.
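
The query-level detection described in the abstract can be illustrated with a minimal sketch: train a text classifier on labeled page content, then label a query by classifying the text of its search result page. The abstract does not state which learning algorithm or features the authors use, so the bag-of-words naive Bayes classifier here is an assumption, as are the toy training texts and the names `NaiveBayesOCI`, `predict_query_oci`, and `TRAIN`, which are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesOCI:
    """Multinomial naive Bayes with add-one smoothing (an assumed model,
    not necessarily the one used in the paper)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def log_score(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        # Start from the log prior of the label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for token in tokenize(text):
            if token in self.vocab:  # tokens never seen in training are skipped
                score += math.log((counts[token] + 1) / (total + len(self.vocab)))
        return score

    def predict(self, text):
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


def predict_query_oci(model, result_snippets):
    """Label a query using the concatenated text of its search result page."""
    return model.predict(" ".join(result_snippets))


# Toy, hand-written training texts (illustrative only, not data from the paper).
TRAIN = [
    ("buy now free shipping add to cart price discount", "commercial"),
    ("order online cheap price sale checkout", "commercial"),
    ("history of the roman empire encyclopedia article", "non-commercial"),
    ("how photosynthesis works in plants biology lesson", "non-commercial"),
]

model = NaiveBayesOCI().fit([text for text, _ in TRAIN], [label for _, label in TRAIN])

print(predict_query_oci(model, ["cheap laptops on sale", "buy laptops online free shipping"]))
print(predict_query_oci(model, ["the history of the roman empire"]))
```

The same classifier could in principle be applied to the text of the top-ranked sites instead of the result page, which corresponds to the second data source mentioned in the abstract.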