Scalable data mining methods are expected to cope with massive Web data, but tracking evolving trends in noisy data continuously, without unnecessary stoppages and reconfigurations, remains an open challenge. This dynamic, single-pass setting can be cast within the framework of mining evolving data streams. In this paper, we explore the task of mining mass user profiles by discovering evolving Web session clusters in a single pass with a recently proposed scalable immune-based clustering approach (TECNO-STREAMS), and we study how the choice of similarity measure affects the mining process and the interpretation of the mined patterns. We propose a simple similarity measure that explicitly couples the precision and coverage criteria to the early learning stages. It defines the affinity of the data to the learned profiles or summaries as the minimum of their coverage and precision, and therefore requires the learned profiles to be simultaneously precise and complete, with no compromises. In our experiments, we mine evolving user profiles from Web clickstream data (Web usage mining) in a single pass under different trend sequencing scenarios. Compared to the cosine similarity measure, the proposed measure, which is explicitly based on precision and coverage, allows the discovery of more correct profiles at the same precision or recall quality levels.
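To make the contrast between the two measures concrete, the following is a minimal sketch and not the paper's implementation. It assumes that sessions and profiles are represented as binary sets of URLs, that precision is the fraction of the profile's URLs found in the session, and that coverage is the fraction of the session's URLs found in the profile. The function names and the example URLs are hypothetical. Under these assumptions the binary cosine similarity equals the geometric mean of precision and coverage, so it can never fall below their minimum.

```python
from math import sqrt


def precision(session: set, profile: set) -> float:
    """Assumed definition: share of the profile's URLs that occur in the session."""
    return len(session & profile) / len(profile) if profile else 0.0


def coverage(session: set, profile: set) -> float:
    """Assumed definition: share of the session's URLs that occur in the profile."""
    return len(session & profile) / len(session) if session else 0.0


def min_precision_coverage(session: set, profile: set) -> float:
    """Affinity as the minimum of precision and coverage (both must be high)."""
    return min(precision(session, profile), coverage(session, profile))


def cosine(session: set, profile: set) -> float:
    """Cosine similarity between two binary URL vectors, written on sets."""
    if not session or not profile:
        return 0.0
    return len(session & profile) / sqrt(len(session) * len(profile))


# Hypothetical example: a one-URL profile matched against a four-URL session.
session = {"/home", "/courses", "/faculty", "/contact"}
profile = {"/home"}

print(min_precision_coverage(session, profile))  # limited by the low coverage
print(cosine(session, profile))                  # geometric mean, hence larger
```

In this example the profile is perfectly precise but covers only a quarter of the session. The minimum-based affinity reflects the weaker of the two criteria, whereas cosine similarity lets the high precision partially offset the low coverage.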