We present a 1-pass algorithm for estimating the most frequent items in a data stream using very limited storage space. Our method relies on a novel data structure called a count sketch, which allows us to estimate the frequencies of all the items in the stream. Our algorithm achieves better space bounds than the previous best known algorithms for this problem for many natural distributions on the item frequencies. In addition, our algorithm leads directly to a 2-pass algorithm for the problem of estimating the items with the largest (absolute) change in frequency between two data streams. To our knowledge, this problem has not been previously studied in the literature.
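The abstract does not spell out how the count sketch works internally, so the following is only a minimal sketch under stated assumptions. It follows the commonly described layout: several rows of counters, where each row hashes an item to one bucket and to a sign of +1 or -1, and the frequency estimate is the median of the signed counters across rows. The names `CountSketch`, `depth`, `width`, `seed`, `change_sketch`, and `largest_changes` are illustrative, and BLAKE2 stands in for the hash families an analysis would require, so no space or error guarantee is claimed for this code.

```python
import hashlib
import statistics


class CountSketch:
    """Illustrative count sketch: `depth` rows of `width` signed counters."""

    def __init__(self, depth=5, width=256, seed=0):
        self.depth = depth
        self.width = width
        self.seed = seed
        self.table = [[0] * width for _ in range(depth)]

    def _bucket_and_sign(self, item, row):
        # Assumption: BLAKE2 replaces the per-row hash pair (bucket, sign).
        key = repr((self.seed, row, item)).encode()
        value = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")
        sign = 1 if value & 1 else -1
        bucket = (value >> 1) % self.width
        return bucket, sign

    def update(self, item, count=1):
        # Add the signed count to one counter in every row.
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(item, row)
            self.table[row][bucket] += sign * count

    def estimate(self, item):
        # Each row gives one estimate; the median damps collisions.
        per_row = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(item, row)
            per_row.append(sign * self.table[row][bucket])
        return statistics.median(per_row)


def change_sketch(stream_a, stream_b, **kwargs):
    """First pass: subtract stream_a and add stream_b into one sketch."""
    sketch = CountSketch(**kwargs)
    for item in stream_a:
        sketch.update(item, -1)
    for item in stream_b:
        sketch.update(item, +1)
    return sketch


def largest_changes(sketch, stream_a, stream_b, k=3):
    """Second pass: rank the items seen by absolute estimated change."""
    candidates = set(stream_a) | set(stream_b)
    ranked = sorted(candidates, key=lambda x: abs(sketch.estimate(x)), reverse=True)
    return [(item, sketch.estimate(item)) for item in ranked[:k]]


if __name__ == "__main__":
    yesterday = ["a", "b", "c", "a"]
    today = ["a", "b", "b", "b", "c", "a"]
    diff = change_sketch(yesterday, today)
    print(largest_changes(diff, yesterday, today, k=2))
```

Because updates are linear, items whose counts are equal in both streams cancel out of the table, which is what makes a single sketch usable for the change problem. For simplicity, `largest_changes` collects every distinct item in a set; a space-limited version would instead keep only the current top `k` candidates while re-reading the streams.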