Optimal histograms for limiting worst-case error propagation in the size of join results
ACM Transactions on Database Systems (TODS)
Balancing histogram optimality and practicality for query result size estimation
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Improved histograms for selectivity estimation of range predicates
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Approximate medians and other quantiles in one pass and with limited memory
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Random sampling for histogram construction: how much is enough?
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data cube approximation and histograms via wavelets
Proceedings of the seventh international conference on Information and knowledge management
Approximate computation of multidimensional aggregates of sparse data using wavelets
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Join synopses for approximate query answering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
The Aqua approximate query answering system
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Progressive vector transmission
Proceedings of the 7th ACM international symposium on Advances in geographic information systems
Optimal histograms for hierarchical range queries (extended abstract)
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Histogram-Based Approximation of Set-Valued Query-Answers
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Universality of Serial Histograms
VLDB '93 Proceedings of the 19th International Conference on Very Large Data Bases
Estimation of Query-Result Distribution and its Application in Parallel-Join Load Balancing
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Selectivity Estimation Without the Attribute Value Independence Assumption
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Fast Incremental Maintenance of Approximate Histograms
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Fast Approximate Answers to Aggregate Queries on a Data Cube
SSDBM '99 Proceedings of the 11th International Conference on Scientific and Statistical Database Management
Mining Deviants in a Time Series Database
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
An Approximate L1-Difference Algorithm for Massive Data Streams
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Stable distributions, pseudorandom generators, embeddings and data stream computation
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Approximate counting of inversions in a data stream
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Fast, small-space algorithms for approximate histogram maintenance
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Characterizing memory requirements for queries over continuous data streams
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Reductions in streaming algorithms, with an application to counting triangles in graphs
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Dynamic multidimensional histograms
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Distributed streams algorithms for sliding windows
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Fast incremental maintenance of approximate histograms
ACM Transactions on Database Systems (TODS)
Temporal and spatio-temporal aggregations over data streams using multiple time granularities
Information Systems - Special issue: Best papers from EDBT 2002
Pass efficient algorithms for approximating large matrices
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Temporal Aggregation over Data Streams Using Multiple Granularities
EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Approximate Query Processing: Taming the TeraBytes
Proceedings of the 27th International Conference on Very Large Data Bases
Histogramming Data Streams with Fast Per-Item Processing
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Clustering Data Streams: Theory and Practice
IEEE Transactions on Knowledge and Data Engineering
Comparing Data Streams Using Hamming Norms (How to Zero In)
IEEE Transactions on Knowledge and Data Engineering
Better streaming algorithms for clustering problems
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
Approximate join processing over data streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Characterizing memory requirements for queries over continuous data streams
ACM Transactions on Database Systems (TODS)
Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Holistic UDAFs at streaming speeds
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Diamond in the rough: finding Hierarchical Heavy Hitters in multi-dimensional data
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Approximating extent measures of points
Journal of the ACM (JACM)
Fast range query estimation by N-level tree histograms
Data & Knowledge Engineering
Semantic Approximation of Data Stream Joins
IEEE Transactions on Knowledge and Data Engineering
Spatiotemporal Aggregate Computation: A Survey
IEEE Transactions on Knowledge and Data Engineering
A Distributed System for Answering Range Queries on Sensor Network Data
PERCOMW '05 Proceedings of the Third IEEE International Conference on Pervasive Computing and Communications Workshops
Stabbing the Sky: Efficient Skyline Computation over Sliding Windows
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Duplicate detection in click streams
WWW '05 Proceedings of the 14th international conference on World Wide Web
On joining and caching stochastic streams
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Graph distances in the streaming model: the value of space
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Space efficiency in synopsis construction algorithms
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Content-based routing: different plans for different data
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Approximation algorithms for wavelet transform coding of data streams
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Maintaining significant stream statistics over sliding windows
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Matrix approximation and projective clustering via volume sampling
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
The space complexity of pass-efficient algorithms for clustering
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Research issues in data stream association rule mining
ACM SIGMOD Record
Approximate Processing of Massive Continuous Quantile Queries over High-Speed Data Streams
IEEE Transactions on Knowledge and Data Engineering
On graph problems in a semi-streaming model
Theoretical Computer Science - Automata, languages and programming: Algorithms and complexity (ICALP-A 2004)
Approximation and streaming algorithms for histogram construction problems
ACM Transactions on Database Systems (TODS)
Discretization from data streams: applications to histograms and data mining
Proceedings of the 2006 ACM symposium on Applied computing
Counting triangles in data streams
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A simpler and more efficient deterministic scheme for finding frequent items over sliding windows
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
CFI-Stream: mining closed frequent itemsets in data streams
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
An integrated efficient solution for computing frequent and top-k elements in data streams
ACM Transactions on Database Systems (TODS)
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Error minimization in approximate range aggregates
Data & Knowledge Engineering
Incremental discretization, application to data with concept drift
Proceedings of the 2007 ACM symposium on Applied computing
A Note on Linear Time Algorithms for Maximum Error Histograms
IEEE Transactions on Knowledge and Data Engineering
Continuously maintaining order statistics over data streams: extended abstract
ADC '07 Proceedings of the eighteenth conference on Australasian database - Volume 63
Comparing data streams using Hamming norms (how to zero in)
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Tuple routing strategies for distributed eddies
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Finding hierarchical heavy hitters in data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Processing sliding window multi-joins in continuous queries over data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
REHIST: relative error histogram construction algorithms
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Memory-limited execution of windowed stream joins
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Finding hierarchical heavy hitters in streaming data
ACM Transactions on Knowledge Discovery from Data (TKDD)
Enhancing histograms by tree-like bucket indices
The VLDB Journal — The International Journal on Very Large Data Bases
Wavelet synopsis for hierarchical range queries with workloads
The VLDB Journal — The International Journal on Very Large Data Bases
Constructing comprehensive summaries of large event sequences
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Event-Based Compression and Mining of Data Streams
KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part II
Summarizing Distributed Data Streams for Storage in Data Warehouses
DaWaK '08 Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery
Multiple-Objective Compression of Data Cubes in Cooperative OLAP Environments
ADBIS '08 Proceedings of the 12th East European conference on Advances in Databases and Information Systems
The VLDB Journal — The International Journal on Very Large Data Bases
Conceptual modeling rules extracting for data streams
Knowledge-Based Systems
A new approach to building histogram for selectivity estimation in query processing optimization
Computers & Mathematics with Applications
Tight results for clustering and summarizing data streams
Proceedings of the 12th International Conference on Database Theory
Continuous privacy preserving publishing of data streams
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Self-tuning query mesh for adaptive multi-route query processing
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Novelty Detection from Evolving Complex Data Streams with Time Windows
ISMIS '09 Proceedings of the 18th International Symposium on Foundations of Intelligent Systems
History Guided Low-Cost Change Detection in Streams
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Constructing comprehensive summaries of large event sequences
ACM Transactions on Knowledge Discovery from Data (TKDD)
Statistical structures for Internet-scale data management
The VLDB Journal — The International Journal on Very Large Data Bases
Location-dependent query processing: Where we are and where we are heading
ACM Computing Surveys (CSUR)
Heuristic Bayesian Segmentation for Discovery of Coexpressed Genes within Genomic Regions
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Recurrent predictive models for sequence segmentation
IDA'07 Proceedings of the 7th international conference on Intelligent data analysis
Estimating clustering indexes in data streams
ESA'07 Proceedings of the 15th annual European conference on Algorithms
Event-based lossy compression for effective and efficient OLAP over data streams
Data & Knowledge Engineering
An algorithmic approach to event summarization
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
DEMS: a data mining based technique to handle missing data in mobile sensor network applications
Proceedings of the Seventh International Workshop on Data Management for Sensor Networks
Fast and accurate computation of equi-depth histograms over data streams
Proceedings of the 14th International Conference on Extending Database Technology
Workload-optimal histograms on streams
ESA'05 Proceedings of the 13th annual European conference on Algorithms
Finding longest increasing and common subsequences in streaming data
COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
Monitoring continuous band-join queries over dynamic data
ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
Adaptively detecting aggregation bursts in data streams
DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications
Adaptive sampling and fast low-rank matrix approximation
APPROX'06/RANDOM'06 Proceedings of the 9th international conference on Approximation Algorithms for Combinatorial Optimization Problems, and 10th international conference on Randomization and Computation
Outlier respecting points approximation
ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
Non-linear data stream compression: foundations and theoretical results
HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I
On pre-processing algorithms for data stream
ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
Improved sketching of hamming distance with error correcting
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
Hi-index | 0.00 |
Histograms have been used widely to capture data distribution, to represent the data by a small number of step functions. Dynamic programming algorithms which provide optimal construction of these histograms exist, albeit running in quadratic time and linear space. In this paper we provide linear time construction of 1 + &egr; approximation of optimal histograms, running in polylogarithmic space.Our results extend to the context of data-streams, and in fact generalize to give 1 + &egr; approximation of several problems in data-streams which require partitioning the index set into intervals. The only assumptions required are that the cost of an interval is monotonic under inclusion (larger interval has larger cost) and that the cost can be computed or approximated in small space. This exhibits a nice class of problems for which we can have near optimal data-stream algorithms.