The problem of finding frequent items over high speed data streams has been studied recently. However, mining frequent itemsets from transactional data streams has not yet been well addressed in terms of bounds on memory consumption. The main difficulty is the exponential explosion of itemsets: given a domain of I unique items, the number of possible itemsets can be up to 2^I - 1. As the length of a data stream approaches a very large number N, the chance that an itemset becomes frequent grows, and such itemsets become difficult to track with limited memory. However, the real killer of effective frequent itemset mining is that most existing algorithms are false-positive oriented. That is, they control memory consumption in the counting process by an error parameter ε, and allow items with support below the specified minimum support s but above s - ε to be counted as frequent. Such false-positive items increase the number of false-positive frequent itemsets exponentially, which may make the problem computationally intractable with bounded memory consumption. In this paper, we develop algorithms that can effectively mine frequent item(set)s from high speed transactional data streams with a bound on memory consumption. Our algorithms are false-negative oriented, that is, certain frequent itemsets may not appear in the results, but the number of false-negative itemsets can be controlled by a predefined parameter so that a desired recall rate of frequent itemsets can be guaranteed. The algorithms are based on the Chernoff bound. Our extensive experimental studies show that the proposed algorithms have high accuracy, require less memory, and consume less CPU time. They significantly outperform the existing false-positive algorithms.
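The effect of the relaxed threshold s - ε on the number of candidate itemsets can be illustrated with a minimal Python sketch. This is not the algorithm of the paper: the function names, the support values, and the choice s = 0.10, ε = 0.01 are all hypothetical and serve only to show how a few extra false-positive items inflate the 2^k - 1 itemsets that can be built from k reported items.

```python
def itemset_count(num_items: int) -> int:
    """Number of non-empty itemsets over num_items distinct items: 2^I - 1."""
    return 2 ** num_items - 1


def report_false_positive(supports: dict, s: float, eps: float) -> set:
    """False-positive oriented rule: report every item with support >= s - eps.

    Items whose support lies in [s - eps, s) are not truly frequent,
    but they are still reported.
    """
    return {item for item, sup in supports.items() if sup >= s - eps}


def report_truly_frequent(supports: dict, s: float) -> set:
    """Reference rule: report only items with support >= s."""
    return {item for item, sup in supports.items() if sup >= s}


# Hypothetical supports (fraction of transactions containing each item).
supports = {"a": 0.12, "b": 0.10, "c": 0.095, "d": 0.092, "e": 0.05}
s, eps = 0.10, 0.01  # assumed minimum support and error parameter

relaxed = report_false_positive(supports, s, eps)
exact = report_truly_frequent(supports, s)

# Upper bound on the itemsets that can be formed from each reported set.
relaxed_candidates = itemset_count(len(relaxed))
exact_candidates = itemset_count(len(exact))

print(sorted(relaxed), relaxed_candidates)
print(sorted(exact), exact_candidates)
```

In this toy setting the relaxed rule admits two extra items, and the upper bound on candidate itemsets grows from 3 to 15; each further false-positive item roughly doubles it.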