The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Communication complexity
Size-estimation framework with applications to transitive closure and reachability
Journal of Computer and System Sciences
Min-wise independent permutations (extended abstract)
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Syntactic clustering of the Web
Selected papers from the sixth international conference on World Wide Web
Tracking join and self-join sizes in limited storage
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Synopsis data structures for massive data sets
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
SFCS '83 Proceedings of the 24th Annual Symposium on Foundations of Computer Science
Testing and spot-checking of data streams (extended abstract)
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
On computing correlated aggregates over continual data streams
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Estimating simple functions on the union of data streams
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Sampling algorithms: lower bounds and applications
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
External memory algorithms and data structures: dealing with massive data
ACM Computing Surveys (CSUR)
Mining data streams under block evolution
ACM SIGKDD Explorations Newsletter
Space lower bounds for distance approximation in the data stream model
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Approximate counting of inversions in a data stream
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Fast, small-space algorithms for approximate histogram maintenance
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Characterizing memory requirements for queries over continuous data streams
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Reductions in streaming algorithms, with an application to counting triangles in graphs
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Distributed streams algorithms for sliding windows
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Pass efficient algorithms for approximating large matrices
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Permutation Editing and Matching via Embeddings
ICALP '01 Proceedings of the 28th International Colloquium on Automata, Languages and Programming
Secure Multiparty Computation of Approximations
ICALP '01 Proceedings of the 28th International Colloquium on Automata, Languages and Programming
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Complexity of Comparing Hidden Markov Models
ISAAC '01 Proceedings of the 12th International Symposium on Algorithms and Computation
An Approximate Lp-Difference Algorithm for Massive Data Streams
STACS '00 Proceedings of the 17th Annual Symposium on Theoretical Aspects of Computer Science
Frequency Estimation of Internet Packet Streams with Limited Space
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Estimating Rarity and Similarity over Data Stream Windows
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Correlating XML data streams using tree-edit distance embeddings
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Clustering Data Streams: Theory and Practice
IEEE Transactions on Knowledge and Data Engineering
Comparing Data Streams Using Hamming Norms (How to Zero In)
IEEE Transactions on Knowledge and Data Engineering
One-Pass Wavelet Decompositions of Data Streams
IEEE Transactions on Knowledge and Data Engineering
Efficient Approximation of Correlated Sums on Data Streams
IEEE Transactions on Knowledge and Data Engineering
Issues in data stream management
ACM SIGMOD Record
Handbook of massive data sets
Better streaming algorithms for clustering problems
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
On finding common neighborhoods in massive graphs
Theoretical Computer Science
Processing set expressions over continuous update streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Computing Highly Specific and Mismatch Tolerant Oligomers Efficiently
CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Characterizing memory requirements for queries over continuous data streams
ACM Transactions on Database Systems (TODS)
Finding frequent items in data streams
Theoretical Computer Science - Special issue on automata, languages and programming
Spatially-decaying aggregation over a network: model and algorithms
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Tracking set-expression cardinalities over continuous update streams
The VLDB Journal — The International Journal on Very Large Data Bases
Adaptive sampling for geometric problems over data streams
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Duplicate detection in click streams
WWW '05 Proceedings of the 14th international conference on World Wide Web
XML stream processing using tree-edit distance embeddings
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2003
Improved range-summable random variable construction algorithms
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
An improved data stream summary: the count-min sketch and its applications
Journal of Algorithms
What's new: finding significant differences in network data streams
IEEE/ACM Transactions on Networking (TON)
Maintaining time-decaying stream aggregates
Journal of Algorithms
Stable distributions, pseudorandom generators, embeddings, and data stream computation
Journal of the ACM (JACM)
DSM-PLW: single-pass mining of path traversal patterns over streaming web click-sequences
Computer Networks: The International Journal of Computer and Telecommunications Networking - Web dynamics
Spatial scan statistics: approximations and performance study
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
An integrated efficient solution for computing frequent and top-k elements in data streams
ACM Transactions on Database Systems (TODS)
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Spatially-decaying aggregation over a network
Journal of Computer and System Sciences
Sketching probabilistic data streams
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Very sparse stable random projections for dimension reduction in lα (0
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Comparing data streams using Hamming norms (how to zero in)
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Adaptive sampling for geometric problems over data streams
Computational Geometry: Theory and Applications
Estimators and tail bounds for dimension reduction in lα (0
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Multi-query optimization for sketch-based estimation
Information Systems
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
ODMCA: An adaptive data mining control algorithm in multicarrier networks
Computer Communications
Leveraging discarded samples for tighter estimation of multiple-set aggregates
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Coordinated weighted sampling for estimating aggregates over multiple weight assignments
Proceedings of the VLDB Endowment
Application of Ɛ-testers algorithms under sketch and streaming calculation model in robot navigation
WSEAS Transactions on Computers
An efficient algorithm for finding similar short substrings from large scale string data
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Measuring independence of datasets
Proceedings of the forty-second ACM symposium on Theory of computing
Approximating sliding windows by cyclic tree-like histograms for efficient range queries
Data & Knowledge Engineering
On the Implementation of Huge Random Objects
SIAM Journal on Computing
Finding longest increasing and common subsequences in streaming data
COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
GCC'05 Proceedings of the 4th international conference on Grid and Cooperative Computing
On approximation algorithms for data mining applications
Efficient Approximation and Online Algorithms
A false negative approach to mining frequent itemsets from high speed transactional data streams
Information Sciences: an International Journal
Improved sketching of hamming distance with error correcting
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
Streaming algorithms measured in terms of the computed quantity
COCOON'07 Proceedings of the 13th annual international conference on Computing and Combinatorics
Testing Closeness of Discrete Distributions
Journal of the ACM (JACM)
We give a space-efficient, one-pass algorithm for approximating the L1 difference $\sum_i |a_i - b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families $\{V_j\}$ of limited-independence random variables that are range-summable, by which we mean that the range sum $\sum_{j=0}^{c-1} V_j(s)$ is computable in time polylog(c), for all seeds s. These random-variable families may be of interest outside our current application domain, namely massive data streams generated by communication networks. Our L1-difference algorithm can be viewed as a "sketching" algorithm, in the sense of [Broder, Charikar, Frieze, and Mitzenmacher, STOC '98, pp. 327-336], and it performs better than that of Broder et al. when used to approximate the symmetric difference of two sets with small symmetric difference.
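To make the notion of range-summability concrete, the following is a minimal Python sketch of a toy family of ±1 random variables whose prefix sums can be computed without enumerating the range. It is not the construction from the paper: the family V_j(s) = (-1)^(s0 + <mask, j>), the seed layout (s0, mask), the index width BITS, and the function names are all assumptions chosen for illustration, and this family has weaker independence than the abstract's algorithm would need. It only shows how a sum over c consecutive variables can be evaluated in a number of steps polynomial in log c.

```python
BITS = 32  # assumed index width: all indices j satisfy 0 <= j < 2**BITS


def parity(x: int) -> int:
    """Return the parity (0 or 1) of the number of set bits in x."""
    return bin(x).count("1") & 1


def v(seed, j: int) -> int:
    """Toy +/-1 variable V_j(s) = (-1)^(s0 + <mask, j>), with seed s = (s0, mask)."""
    s0, mask = seed
    return -1 if (s0 ^ parity(mask & j)) else 1


def range_sum(seed, c: int) -> int:
    """Return sum_{j=0}^{c-1} V_j(s) using at most BITS loop iterations.

    The range [0, c) is split into dyadic blocks, one per set bit of c.
    Inside a block of length 2**k the low k bits of j take every value,
    so the block sums to 0 unless the low k bits of mask are all zero,
    in which case every term equals the value at the block's start.
    """
    assert 0 <= c < (1 << BITS)
    _, mask = seed
    total = 0
    prefix = 0  # start of the current block: the bits of c above position k
    for k in reversed(range(BITS)):
        if (c >> k) & 1:
            if mask & ((1 << k) - 1) == 0:
                total += (1 << k) * v(seed, prefix)
            prefix |= 1 << k
    return total


# Usage: compare the fast range sum against direct summation for one seed.
example_seed = (1, 0b1011000)
assert range_sum(example_seed, 100) == sum(v(example_seed, j) for j in range(100))
```

The dyadic decomposition is the design choice that keeps the cost logarithmic in c: each block contributes either nothing or a single closed-form term, so no individual V_j beyond one per block is ever evaluated.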