Web usage mining has proven to be an important advance for e-business systems, both by finding web user buying patterns and by suggesting ways to improve web user navigation. A primary input for web usage mining is the set of web user sessions. When sessions are not otherwise identified, they must be constructed from web server logs, a process called sessionization. We use bipartite cardinality matching and a more general integer program to construct sessions. We also propose several variations of our integer program to provide additional insights into session characteristics. For testing, we retrieve 15 months of web server logs and the corresponding real sessions from an academic web site. We compare the real sessions, the results obtained by our optimization models, and the results from a commonly used timeout heuristic. We find that our optimization models dominate the timeout heuristic on several comparison measures. Solution time for a typical month is seven hours for our integer program, 30 minutes for our bipartite cardinality matching, and about 1 minute for the heuristic. Although solution time is significantly greater for the integer program, its variations contribute additional analysis of web user behavior.
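For readers unfamiliar with the baseline, a timeout heuristic of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the procedure evaluated here: the `LogEntry` fields, the choice of (IP address, user agent) as the visitor key, the inactivity-gap variant of the rule, and the 1800-second threshold are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class LogEntry(NamedTuple):
    """One parsed web server log record (hypothetical field layout)."""
    ip: str
    agent: str
    timestamp: float  # seconds since some fixed epoch
    page: str


def sessionize(entries, timeout=1800.0):
    """Split log entries into sessions using an inactivity timeout.

    Entries are grouped by (ip, agent); within a group, a new session
    starts whenever the gap since the previous request exceeds `timeout`.
    The 1800 s default is an assumed value, not one taken from the text.
    """
    by_visitor = defaultdict(list)
    for entry in entries:
        by_visitor[(entry.ip, entry.agent)].append(entry)

    sessions = []
    for visitor in sorted(by_visitor):
        current = []
        for entry in sorted(by_visitor[visitor], key=lambda r: r.timestamp):
            # Close the open session if the visitor was idle too long.
            if current and entry.timestamp - current[-1].timestamp > timeout:
                sessions.append(current)
                current = []
            current.append(entry)
        if current:
            sessions.append(current)
    return sessions


# Toy log: two visitors, one of whom returns after a long pause.
log = [
    LogEntry("10.0.0.1", "A", 0.0, "/"),
    LogEntry("10.0.0.1", "A", 600.0, "/a"),
    LogEntry("10.0.0.2", "B", 700.0, "/"),
    LogEntry("10.0.0.1", "A", 4000.0, "/b"),
]
sessions = sessionize(log)
for s in sessions:
    print([e.page for e in s])
```

The heuristic makes a single pass over each visitor's sorted requests, which is consistent with its short running time relative to the optimization models. It ignores the site's link structure entirely, so it cannot separate interleaved sessions that share the same visitor key.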