A vector $A$ of length $N$ is defined implicitly, via a stream of updates of the form "add 5 to $A_3$." We give a sketching algorithm that constructs a small sketch from the stream of updates, and a reconstruction algorithm that produces a $B$-bucket piecewise-constant representation (histogram) $H$ for $A$ from the sketch, such that $\|A - H\| \le (1+\epsilon)\|A - H_{\mathrm{opt}}\|$, where $H_{\mathrm{opt}}$ is the best $B$-bucket histogram for $A$ and the error $\|A - H\|$ is either $\ell_1$ (absolute) or $\ell_2$ (root-mean-square) error. The time to process a single update, the time to reconstruct the histogram, and the size of the sketch are each bounded by $\mathrm{poly}(B, \log N, \log\|A\|, 1/\epsilon)$. Our result is obtained in two steps. First, we obtain what we call a robust histogram approximation for $A$: a histogram such that adding a small number of buckets does not significantly improve the representation quality. Second, from the robust histogram we cull a histogram with the desired accuracy and $B$ buckets. This technique also provides similar results for Haar wavelet representations under $\ell_2$ error. Our results have applications in summarizing data distributions quickly and succinctly, even in distributed settings.
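To make the objects in the abstract concrete, the following minimal Python sketch materializes $A$ from a stream of updates and computes an optimal $B$-bucket histogram under squared $\ell_2$ error with a standard dynamic program. This is an offline baseline for illustration only, not the sketching algorithm described above: it stores all of $A$ and takes $O(BN^2)$ time, which is exactly the cost a small sketch is meant to avoid. The names `apply_updates` and `optimal_histogram` are hypothetical.

```python
from typing import List, Tuple

Bucket = Tuple[int, int, float]  # (start, end_exclusive, constant value)


def apply_updates(n: int, updates: List[Tuple[int, float]]) -> List[float]:
    """Materialize A from a stream of (index, delta) updates, e.g. (3, 5.0)."""
    a = [0.0] * n
    for index, delta in updates:
        a[index] += delta
    return a


def optimal_histogram(a: List[float], b: int) -> Tuple[float, List[Bucket]]:
    """Best histogram with at most b buckets; returns (squared l2 error, buckets)."""
    n = len(a)
    # Prefix sums of values and of squares give any bucket's error in O(1).
    s = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for i, x in enumerate(a):
        s[i + 1] = s[i] + x
        q[i + 1] = q[i] + x * x

    def sse(i: int, j: int) -> float:
        """Squared error of approximating a[i:j] by its mean."""
        total = s[j] - s[i]
        return (q[j] - q[i]) - total * total / (j - i)

    inf = float("inf")
    # err[k][j]: least squared error covering a[:j] with exactly k buckets.
    err = [[inf] * (n + 1) for _ in range(b + 1)]
    cut = [[0] * (n + 1) for _ in range(b + 1)]
    err[0][0] = 0.0
    for k in range(1, b + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):
                cand = err[k - 1][i] + sse(i, j)
                if cand < err[k][j]:
                    err[k][j] = cand
                    cut[k][j] = i

    # Using fewer than b buckets is allowed, so take the best bucket count.
    best_k = min(range(1, b + 1), key=lambda k: err[k][n])

    # Walk the stored cut points backwards to recover the buckets.
    buckets: List[Bucket] = []
    j, k = n, best_k
    while k > 0:
        i = cut[k][j]
        buckets.append((i, j, (s[j] - s[i]) / (j - i)))
        j, k = i, k - 1
    buckets.reverse()
    return max(err[best_k][n], 0.0), buckets
```

For example, `optimal_histogram(apply_updates(8, updates), 2)` returns the error of $H_{\mathrm{opt}}$ for $B = 2$; an algorithm with the guarantee stated above would need to return a histogram whose error is within a factor $(1+\epsilon)$ of that value.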