A machine learning approach to identifying database sessions using unlabeled data

  • Authors: Qingsong Yao; Xiangji Huang; Aijun An

  • Affiliations: York University, Toronto, Canada (all authors)

  • Venue: DaWaK'05 Proceedings of the 7th International Conference on Data Warehousing and Knowledge Discovery
  • Year: 2005

Abstract

In this paper, we describe a novel co-training-based algorithm for identifying database user sessions from database traces. The algorithm learns to identify positive data (session boundaries) and negative data (non-session boundaries) incrementally by applying two methods interactively over several iterations. In each iteration, previously identified positive and negative data are used to build better models, which in turn label some new data and improve the performance of later iterations. We also present experimental results.
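
The iterative labeling loop described in the abstract can be illustrated with a minimal co-training sketch. This is not the authors' implementation: the abstract does not name the two methods, so the sketch assumes two hypothetical one-dimensional views of each candidate boundary (`idle_seconds`, the idle time before a request, and `start_score`, a made-up score for how much a request resembles a session start) and uses a simple nearest-centroid rule in place of real models. All function names, parameters, and data values are illustrative.

```python
from statistics import mean


def fit_centroids(values, labels):
    """Return (positive centroid, negative centroid) of a 1-D feature.

    Assumes the labeled set contains at least one positive and one negative.
    """
    pos = mean(v for v, y in zip(values, labels) if y == 1)
    neg = mean(v for v, y in zip(values, labels) if y == 0)
    return pos, neg


def margin(value, centroids):
    """Signed confidence: > 0 leans positive, < 0 leans negative."""
    pos, neg = centroids
    return abs(value - neg) - abs(value - pos)


def co_train(view_a, view_b, seed_labels, rounds=3, per_round=1):
    """Grow a labeled set by letting two views take turns labeling.

    view_a, view_b: one feature value per candidate boundary (hypothetical views).
    seed_labels: dict index -> 1 (session boundary) or 0 (non-boundary).
    """
    labels = dict(seed_labels)
    for _ in range(rounds):
        for view in (view_a, view_b):
            unlabeled = [i for i in range(len(view)) if i not in labels]
            if not unlabeled:
                return labels
            known = sorted(labels)
            centroids = fit_centroids(
                [view[i] for i in known], [labels[i] for i in known]
            )
            # Most confident negatives come first, most confident positives last.
            ranked = sorted(unlabeled, key=lambda i: margin(view[i], centroids))
            for i in ranked[:per_round]:
                if margin(view[i], centroids) < 0:
                    labels[i] = 0
            for i in ranked[-per_round:]:
                if margin(view[i], centroids) > 0:
                    labels[i] = 1
    return labels


# Illustrative data only: eight candidate boundaries, two seed labels.
idle_seconds = [0.2, 45.0, 0.5, 0.3, 60.0, 1.0, 30.0, 0.4]
start_score = [0.1, 0.9, 0.2, 0.1, 0.8, 0.3, 0.7, 0.2]
result = co_train(idle_seconds, start_score, seed_labels={0: 0, 1: 1})
print(sorted(i for i, y in result.items() if y == 1))
```

Each view only commits to the examples it is most confident about, and those new labels become training data for the other view on its next turn, which is the mechanism by which earlier iterations are meant to improve later ones.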