Most existing personalization systems rely on site-centric user data: the only inputs available to the system are the user's behaviors on a specific site. We use a dataset supplied by a major audience measurement company that provides a complete user-centric view of clickstream behavior. Using the supplied product purchase metadata to set up a prediction problem, we learn models of a user's probability of purchase within a time window for multiple product categories, using features that represent the user's browsing and search behavior across all websites. As a baseline, we compare our results to the best such models that can be learned from site-centric data at a major search engine site. We demonstrate substantial improvements in accuracy with comparable and often better recall. We also propose a novel behaviorally (as opposed to syntactically) based search term suggestion algorithm for feature selection on clickstream data. Finally, our models are not privacy invasive: if deployed client-side, they amount to a dynamic "smart cookie" that expresses a user's individual intentions with a precise probabilistic interpretation.
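To make the prediction setup concrete, the sketch below shows one way a single training example could be built from a user-centric clickstream: features are counted over a window before a cutoff time, and the label records whether a purchase in the target category falls in the window after it. This is an illustration under stated assumptions, not the paper's actual pipeline. The function name `build_example`, the event tuple layout, the 7-day window lengths, and the `visit:`/`search:` feature naming are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def build_example(events, purchases, cutoff, category,
                  feature_days=7, label_days=7):
    """Build one (features, label) pair for a user (hypothetical layout).

    events:    list of (timestamp, domain, search_query_or_None)
    purchases: list of (timestamp, product_category)
    cutoff:    features use events before it, the label uses purchases after it
    """
    start = cutoff - timedelta(days=feature_days)
    end = cutoff + timedelta(days=label_days)

    features = Counter()
    for ts, domain, query in events:
        if start <= ts < cutoff:
            # Browsing behavior on any website, not just a single site.
            features["visit:" + domain] += 1
            if query:
                # Search behavior, split into lowercase terms.
                for term in query.lower().split():
                    features["search:" + term] += 1

    # Label: did the user buy in this category within the prediction window?
    label = int(any(cat == category and cutoff <= ts < end
                    for ts, cat in purchases))
    return dict(features), label


# Made-up example data for one user.
events = [
    (datetime(2024, 1, 3), "shop.example", None),
    (datetime(2024, 1, 4), "search.example", "Cheap Laptop"),
    (datetime(2023, 12, 1), "old.example", None),  # outside the feature window
]
purchases = [(datetime(2024, 1, 8), "computers")]
cutoff = datetime(2024, 1, 5)

features, label = build_example(events, purchases, cutoff, "computers")
```

A site-centric baseline in this sketch would correspond to keeping only the events whose domain is the one site in question before counting features; any probabilistic classifier could then be trained on the resulting pairs.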