The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Using path profiles to predict HTTP requests
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Tracking join and self-join sizes in limited storage
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A framework for measuring changes in data characteristics
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient mining of emerging patterns: discovering trends and differences
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Synopsis data structures for massive data sets
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Mining high-speed data streams
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining time-changing data streams
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Real world performance of association rule algorithms
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Mining data streams under block evolution
ACM SIGKDD Explorations Newsletter
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Continuous queries over data streams
ACM SIGMOD Record
Efficient Data Mining for Path Traversal Patterns
IEEE Transactions on Knowledge and Data Engineering
Computing Iceberg Queries Efficiently
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Frequency Estimation of Internet Packet Streams with Limited Space
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
A simple algorithm for finding frequent elements in streams and bags
ACM Transactions on Database Systems (TODS)
Issues in data stream management
ACM SIGMOD Record
An Approximate L1-Difference Algorithm for Massive Data Streams
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Web usage mining: discovery and applications of usage patterns from Web data
ACM SIGKDD Explorations Newsletter
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Streaming-Data Algorithms for High-Quality Clustering
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Web Mining: Information and Pattern Discovery on the World Wide Web
ICTAI '97 Proceedings of the 9th International Conference on Tools with Artificial Intelligence
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach
Data Mining and Knowledge Discovery
Finding recent frequent itemsets adaptively over online data streams
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
On mining webclick streams for path traversal patterns
Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach
IEEE Transactions on Knowledge and Data Engineering
What's hot and what's not: tracking most frequent items dynamically
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2003
Multi-dimensional regression analysis of time-series data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
StatStream: statistical monitoring of thousands of data streams in real time
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Decision trees for web log mining
Intelligent Data Analysis
A framework for clustering evolving data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Approximate mining of maximal frequent itemsets in data streams with different window models
Expert Systems with Applications: An International Journal
A sliding window method for finding top-k path traversal patterns over streaming Web click-sequences
Expert Systems with Applications: An International Journal
Expert Systems with Applications: An International Journal
Mining Web navigation patterns with a path traversal graph
Expert Systems with Applications: An International Journal
Fuzzy classification in web usage mining using fuzzy quantifiers
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Sliding window based weighted maximal frequent pattern mining over data streams
Expert Systems with Applications: An International Journal
Mining maximal frequent patterns by considering weight conditions over data streams
Knowledge-Based Systems
Hi-index | 0.01 |
Mining Web click streams is an important data mining problem with broad applications. However, it is also a difficult problem since the streaming data possess some interesting characteristics, such as unknown or unbounded length, possibly a very fast arrival rate, inability to backtrack over previously arrived click-sequences, and a lack of system control over the order in which the data arrive. In this paper, we propose a projection-based, single-pass algorithm, called DSM-PLW (Data Stream Mining for Path traversal patterns in a Landmark Window), for online incremental mining of path traversal patterns over a continuous stream of maximal forward references generated at a rapid rate. According to the algorithm, each maximal forward reference of the stream is projected into a set of reference-suffix maximal forward references, and these reference-suffix maximal forward references are inserted into a new in-memory summary data structure, called SP-forest (Summary Path traversal pattern forest), which is an extended prefix tree-based data structure for storing essential information about frequent reference sequences of the stream so far. The set of all maximal reference sequences is determined from the SP-forest by a depth-first-search mechanism, called MRS-mining (Maximal Reference Sequence mining). Theoretical analysis and experimental studies show that the proposed algorithm has gently growing memory requirements and makes only one pass over the streaming data.