The challenge of monitoring the massive volumes of data generated by communication networks has driven interest in data stream processing. We study streams of edges in massive communication multigraphs, where each edge is a (source, destination) pair. The goal is to compute properties of the underlying graph using small space (much smaller than the number of communicants) while avoiding the bias that arises because some edges appear many times and others only once. We give results for three fundamental problems on multigraph degree sequences: estimating frequency moments of degrees, finding the heavy hitter degrees, and computing range sums of degree values. In all cases we show space bounds for our summarizing algorithms that are significantly smaller than the space needed to store complete information. We use a variety of data stream methods (sketches, sampling, hashing, and distinct counting), but a common feature is the use of cascaded summaries, in which multiple estimation techniques are nested within one another. In our experimental study, such summaries prove highly effective: massive multigraph streams can be summarized in a small amount of space while still answering the queries of interest with high accuracy.
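To make the idea of a cascaded summary concrete, here is a minimal Python sketch that nests a distinct counter inside each cell of a hashed array, so that repeated (source, destination) pairs do not inflate a source's degree. This is an illustration under stated assumptions, not the algorithm from the paper: the class names `KMV` and `CascadedDegreeSketch`, the k-minimum-values counter, the take-the-minimum query rule, and all parameter values are hypothetical choices made for this example.

```python
import hashlib
from collections import defaultdict


def _hash64(*parts):
    """Deterministically map the given parts to a 64-bit integer."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class KMV:
    """k-minimum-values distinct counter (assumes k >= 2)."""

    def __init__(self, k=64):
        self.k = k
        self.values = set()  # the k smallest normalized hash values seen

    def add(self, item):
        h = _hash64("kmv", item) / 2**64
        if h in self.values:
            return  # duplicate item: the summary is unchanged
        if len(self.values) < self.k:
            self.values.add(h)
            return
        largest = max(self.values)
        if h < largest:
            self.values.remove(largest)
            self.values.add(h)

    def estimate(self):
        # Fewer than k distinct hashes stored: the count is exact
        # (barring 64-bit hash collisions).
        if len(self.values) < self.k:
            return float(len(self.values))
        return (self.k - 1) / max(self.values)


class CascadedDegreeSketch:
    """Hashed array of cells, each cell holding a distinct counter."""

    def __init__(self, rows=3, width=32, k=64):
        self.rows, self.width = rows, width
        self.cells = [[KMV(k) for _ in range(width)] for _ in range(rows)]

    def _col(self, row, src):
        return _hash64("row", row, src) % self.width

    def add_edge(self, src, dst):
        # Outer level: hash the source to one cell per row.
        # Inner level: that cell counts distinct (src, dst) pairs.
        for r in range(self.rows):
            self.cells[r][self._col(r, src)].add((src, dst))

    def degree(self, src):
        # Sources sharing a cell can only add distinct pairs, so the
        # smallest cell estimate is the least contaminated one.
        return min(
            self.cells[r][self._col(r, src)].estimate()
            for r in range(self.rows)
        )


def exact_degrees(edges):
    """Baseline: distinct destinations per source, using full storage."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
    return {src: len(dsts) for src, dsts in neighbours.items()}
```

A short usage example: feed the same edge stream to both the sketch and the exact baseline, then compare `sketch.degree(src)` with `exact_degrees(edges)[src]`.

```python
edges = [("a", "x"), ("a", "y"), ("a", "x"), ("b", "x"), ("a", "y")]
sketch = CascadedDegreeSketch()
for src, dst in edges:
    sketch.add_edge(src, dst)
print(exact_degrees(edges)["a"], sketch.degree("a"))
```

The design point being illustrated is the nesting itself: the outer hashing bounds the number of cells, and the inner distinct counter removes the bias from repeated edges. The space used is fixed by `rows`, `width`, and `k` rather than by the number of sources.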