Towards Knowledge Discovery from WWW Log Data

Authors:
Feng Tao;Fionn Murtagh
Affiliations:
-;-
Venue:
ITCC '00 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'00)
Year:
2000

Abstract

As the result of interactions between visitors and a web site, an HTTP log file contains rich knowledge about users' on-site behavior which, if fully exploited, can improve customer service and site performance. Unlike most existing log analysis tools, which report statistical counts of pages, hosts, etc., we propose a transaction model to represent users' access history and a framework to adapt data mining techniques such as sequence and association rule mining to these transactions. In this framework, all transactions are extracted from the raw log file through a series of systematic data preparation phases. We discuss different methods to identify a user and to separate long, convoluted sequences into semantically meaningful sessions and transactions. A new feature called interestingness is defined to model user interest in different sections of a web site. Once all the transactions are imported into an adapted cube structure with a concept hierarchy attached to each of its dimensions, it is possible to carry out multi-dimensional data mining at multiple levels of abstraction. Using interest context rules, we demonstrate the potential significance of this system prototype.
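The abstract does not say which user identification or session splitting method the authors adopt, so the following is only an illustrative sketch of one common approach, not the paper's procedure. It assumes that a user can be approximated by the requesting host address and that a gap of more than 30 minutes between requests starts a new session. The function name `sessionize`, the record layout `(host, timestamp, url)`, and the timeout value are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a session ends after 30 minutes of inactivity.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Group (host, timestamp, url) records into per-host sessions.

    Assumption: the host address stands in for the user identity.
    Returns a list of (host, [url, ...]) tuples, one per session.
    """
    # Collect every request made by the same host.
    by_host = defaultdict(list)
    for host, ts, url in records:
        by_host[host].append((ts, url))

    sessions = []
    for host, hits in by_host.items():
        # Order each host's requests by time before splitting.
        hits.sort(key=lambda hit: hit[0])
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, url in hits[1:]:
            # A long pause closes the current session and opens a new one.
            if ts - last_ts > timeout:
                sessions.append((host, current))
                current = []
            current.append(url)
            last_ts = ts
        sessions.append((host, current))
    return sessions


# Made-up log records, for illustration only.
records = [
    ("10.0.0.1", datetime(2000, 3, 1, 9, 0), "/index.html"),
    ("10.0.0.1", datetime(2000, 3, 1, 9, 5), "/products/a.html"),
    ("10.0.0.2", datetime(2000, 3, 1, 9, 6), "/index.html"),
    ("10.0.0.1", datetime(2000, 3, 1, 11, 0), "/support/faq.html"),
]
result = sessionize(records)
```

Each resulting URL list could then serve as one transaction for sequence or association rule mining. Identifying users by host alone is a simplification, since proxies and shared machines can merge several visitors into one apparent user.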