Online aggregation

Authors:
Joseph M. Hellerstein;Peter J. Haas;Helen J. Wang
Affiliations:
Computer Science Division, University of California, Berkeley;Almaden Research Center, IBM Research Division;Computer Science Division, University of California, Berkeley
Venue:
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Year:
1997

Citing 23
Cited 273

Processing aggregate relational queries with hard time constraints

SIGMOD '89 Proceedings of the 1989 ACM SIGMOD international conference on Management of data
Error-constrained COUNT query evaluation in relational databases

SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Multiresolution coding techniques for digital television: a review

Multidimensional Systems and Signal Processing
Efficient sampling strategies for relational database operations

ICDT Selected papers of the 4th international conference on Database theory
Query execution techniques for caching expensive methods

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Cost-based optimization for magic: algebra and implementation

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Selectivity and cost estimation for joins based on random sampling

Journal of Computer and System Sciences
Processing queries for first-few answers

CIKM '96 Proceedings of the fifth international conference on Information and knowledge management
Statistical estimators for relational algebra expressions

Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
The importance of percent-done progress indicators for computer-human interfaces

CHI '85 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Dataflow query execution in a parallel main-memory environment

PDIS '91 Proceedings of the first international conference on Parallel and distributed information systems
Access path selection in a relational database management system

SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Implementation techniques for main memory database systems

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
APPROXIMATE: A Query Processor that Produces Monotonically Improving Approximate Answers

IEEE Transactions on Knowledge and Data Engineering
Optimizing Queries with Aggregate Views

EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Complex Query Decorrelation

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Tioga-2: A Direct Manipulation Database Visualization Environment

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Hashing Methods and Relational Algebra Operations

VLDB '84 Proceedings of the 10th International Conference on Very Large Data Bases
Aggregate-Query Processing in Data Warehousing Environments

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Eager Aggregation and Lazy Aggregation

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Query processing and optimization in Oracle Rdb

The VLDB Journal — The International Journal on Very Large Data Bases

Exploratory mining and pruning optimizations of constrained associations rules

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Reusing invariants: a new strategy for correlated queries

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Interaction of query evaluation and buffer management for information retrieval

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Incremental distance join algorithms for spatial databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
New sampling-based summary statistics for improving approximate query answers

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
CONTROL: continuous output and navigation technology with refinement on-line

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data cube approximation and histograms via wavelets

Proceedings of the seventh international conference on Information and knowledge management
Tracking join and self-join sizes in limited storage

PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Exact and approximate aggregation in constraint query languages

PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Online association rule mining

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Approximate computation of multidimensional aggregates of sparse data using wavelets

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
A comparison of selectivity estimators for range queries on metric attributes

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
On random sampling over joins

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Join synopses for approximate query answering

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Ripple joins for online aggregation

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Temporal, geographical and categorical aggregations viewed through coordinated displays: a case study with highway incident data

Proceedings of the 1999 workshop on new paradigms in information visualization and manipulation in conjunction with the eighth ACM international conference on Information and knowledge management
Density biased sampling: an improved method for data mining and clustering

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Approximating multi-dimensional aggregate range queries over real attributes

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Congressional samples for approximate answering of group-by queries

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimal and approximate computation of summary statistics for range aggregates

PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
On computing correlated aggregates over continual data streams

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Iceberg-cube computation with PC clusters

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A robust, optimization-based approach for approximate answering of aggregate queries

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Progressive approximate aggregate queries with a multi-resolution tree structure

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Mining data streams under block evolution

ACM SIGKDD Explorations Newsletter
Loglinear-Based Quasi Cubes

Journal of Intelligent Information Systems
Models and issues in data stream systems

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient aggregation over objects with extent

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
How to evaluate multiple range-sum queries progressively

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Least expected cost query optimization: what can we expect?

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A scalable hash ripple join algorithm

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Partial results for online query processing

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Dwarf: shrinking the PetaCube

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Design and evaluation of a conit-based continuous consistency model for replicated services

ACM Transactions on Computer Systems (TOCS)
Nonmonotonic reasoning in LDL++

Logic-based artificial intelligence
Query processing of streamed XML data

Proceedings of the eleventh international conference on Information and knowledge management
Informix under CONTROL: Online Query Processing

Data Mining and Knowledge Discovery
Approximate Query Answering Using Data Warehouse Striping

Journal of Intelligent Information Systems - Special issue on data warehousing and knowledge discovery
A retrieval technique for high-dimensional data and partially specified queries

Data & Knowledge Engineering
The cougar approach to in-network query processing in sensor networks

ACM SIGMOD Record
Continuous queries over data streams

ACM SIGMOD Record
Polaris: A System for Query, Analysis, and Visualization of Multidimensional Relational Databases

IEEE Transactions on Visualization and Computer Graphics
Interactive Data Analysis: The Control Project

Computer
Query Rewriting for SWIFT (First) Answers

IEEE Transactions on Knowledge and Data Engineering
Finding Interesting Associations without Support Pruning

IEEE Transactions on Knowledge and Data Engineering
High-dimensional nearest neighbor search with remote data centers

Knowledge and Information Systems
Temporal and spatio-temporal aggregations over data streams using multiple time granularities

Information Systems - Special issue: Best papers from EDBT 2002
Approximated trial and error analysis in scientific databases

Information Systems - Special issue: Best papers from EDBT 2002
Optimizing Scientific Databases for Client Side Data Processing

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Temporal Aggregation over Data Streams Using Multiple Granularities

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
ProPolyne: A Fast Wavelet-Based Algorithm for Progressive Evaluation of Polynomial Range-Sum Queries

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Estimating Range Queries Using Aggregate Data with Integrity Constraints: A Probabilistic Approach

ICDT '01 Proceedings of the 8th International Conference on Database Theory
Probabilistic Optimization of Top N Queries

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Online Feedback for Nested Aggregate Queries with Multi-Threading

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Histogram-Based Approximation of Set-Valued Query-Answers

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Online Dynamic Reordering for Interactive Data Processing

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Using SQL to Build New Aggregates and Extenders for Object-Relational Systems

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Approximate Query Processing Using Wavelets

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Offering a Precision-Performance Tradeoff for Aggregation Queries over Replicated Data

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports

Proceedings of the 27th International Conference on Very Large Data Bases
Approximate Query Processing: Taming the TeraBytes

Proceedings of the 27th International Conference on Very Large Data Bases
XXL - A Library Approach to Supporting Efficient Implementations of Advanced Database Queries

Proceedings of the 27th International Conference on Very Large Data Bases
Value Range Queries on Earth Science Data via Histogram Clustering

TSDM '00 Proceedings of the First International Workshop on Temporal, Spatial, and Spatio-Temporal Data Mining-Revised Papers
Querying and Clustering Very Large Data Sets Using Dynamic Bucketing Approach

WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
Key Constraints and Monotonic Aggregates in Deductive Databases

Computational Logic: Logic Programming and Beyond, Essays in Honour of Robert A. Kowalski, Part II
Comparison of Genetic and Tabu Search Algorithms in Multiquery Optimization in Advanced Database Systems

ADVIS '00 Proceedings of the First International Conference on Advances in Information Systems
QoS-Driven Load Shedding on Data Streams

EDBT '02 Proceedings of the Workshops XMLDM, MDDE, and YRWS on XML-Based Data Management and Multimedia Engineering-Revised Papers
Compressed Datacubes for fast OLAP Applications

DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
BEDAWA - A Tool for Generating Sample Data for Data Warehouses

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Supporting Online Queries in ROLAP

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Approximate Query Answering Using Data Warehouse Striping

DaWaK '01 Proceedings of the Third International Conference on Data Warehousing and Knowledge Discovery
A Decathlon in Multidimensional Modeling: Open Issues and Some Solutions

DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Time-Interval Sampling for Improved Estimations in Data Warehouses

DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Limiting Result Cardinalities for Multidatabase Queries Using Histograms

BNCOD 18 Proceedings of the 18th British National Conference on Databases: Advances in Databases
Complex Queries in DHT-based Peer-to-Peer Networks

IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Large-Sample and Deterministic Confidence Intervals for Online Aggregation

SSDBM '97 Proceedings of the Ninth International Conference on Scientific and Statistical Database Management
Summary Grids: Building Accurate Multidimensional Histograms

DASFAA '99 Proceedings of the Sixth International Conference on Database Systems for Advanced Applications
Supporting Group-By and Pipelining in Bitmap-Enabled Query Processors

SOFSEM '99 Proceedings of the 26th Conference on Current Trends in Theory and Practice of Informatics on Theory and Practice of Informatics
Performance Analysis of Database Systems

Performance Evaluation: Origins and Directions
Scalable Fault-Tolerant Aggregation in Large Process Groups

DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
User-Defined Aggregates in Database Languages

DBPL '99 Revised Papers from the 7th International Workshop on Database Programming Languages: Research Issues in Structured and Semistructured Database Programming
Fine Grained Replication in Distributed Databases: A Taxonomy and Practical Considerations

DEXA '00 Proceedings of the 11th International Conference on Database and Expert Systems Applications
Processing of Continuous Queries over Unlimited Data Streams

DEXA '02 Proceedings of the 13th International Conference on Database and Expert Systems Applications
Online dynamic reordering

The VLDB Journal — The International Journal on Very Large Data Bases
Progressive evaluation of nested aggregate queries

The VLDB Journal — The International Journal on Very Large Data Bases
Approximate query processing using wavelets

The VLDB Journal — The International Journal on Very Large Data Bases
Exploiting Punctuation Semantics in Continuous Data Streams

IEEE Transactions on Knowledge and Data Engineering
pCube: Update-Efficient Online Aggregation with Progressive Feedback and Error Bounds

SSDBM '00 Proceedings of the 12th International Conference on Scientific and Statistical Database Management
TAG: a Tiny AGgregation service for ad-hoc sensor networks

ACM SIGOPS Operating Systems Review - OSDI '02: Proceedings of the 5th symposium on Operating systems design and implementation
Generalized substring selectivity estimation

Journal of Computer and System Sciences - Special issue on PODS 2000
Dynamic sample selection for approximate query processing

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Aurora: a new model and architecture for data stream management

The VLDB Journal — The International Journal on Very Large Data Bases
Hierarchical dwarfs for the rollup cube

DOLAP '03 Proceedings of the 6th ACM international workshop on Data warehousing and OLAP
DSQoS-distributed architecture providing QoS in summary warehouses

DOLAP '03 Proceedings of the 6th ACM international workshop on Data warehousing and OLAP
Efficient dynamic mining of constrained frequent sets

ACM Transactions on Database Systems (TODS)
Probabilistic wavelet synopses

ACM Transactions on Database Systems (TODS)
Approximate Temporal Aggregation

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Spatio-Temporal Aggregation Using Sketches

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Hash-Merge Join: A Non-blocking Join Algorithm for Producing Fast and Early Join Results

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Approximate Selection Queries over Imprecise Data

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Load Shedding for Aggregation Queries over Data Streams

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Evaluating holistic aggregators efficiently for very large datasets

The VLDB Journal — The International Journal on Very Large Data Bases
Expressing and optimizing sequence queries in database systems

ACM Transactions on Database Systems (TODS)
A bi-level Bernoulli scheme for database sampling

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Online maintenance of very large random samples

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Adapting to source properties in processing data integration queries

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
The price of validity in dynamic networks

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Approximation techniques for spatial data

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Toward a progress indicator for database queries

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Estimating progress of execution for SQL queries

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Query sampling in DB2 Universal Database

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Optimization of data stream processing

ACM SIGMOD Record
Balancing energy efficiency and quality of aggregate data in sensor networks

The VLDB Journal — The International Journal on Very Large Data Bases
Spatiotemporal Aggregate Computation: A Survey

IEEE Transactions on Knowledge and Data Engineering
A Distributed System for Answering Range Queries on Sensor Network Data

PERCOMW '05 Proceedings of the Third IEEE International Conference on Pervasive Computing and Communications Workshops
Optimization of in-network data reduction

DMSN '04 Proceedings of the 1st international workshop on Data management for sensor networks: in conjunction with VLDB 2004
Deterministic wavelet thresholding for maximum-error metrics

PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Selectivity estimators for multidimensional range queries over real attributes

The VLDB Journal — The International Journal on Very Large Data Bases
TAG: a Tiny AGgregation service for Ad-Hoc sensor networks

OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementation
Estimating arbitrary subset sums with few probes

Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A disk-based join with probabilistic guarantees

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
When can we trust progress estimators for SQL queries?

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Relational confidence bounds are easy with the bootstrap

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Integrated resource management for data stream systems

Proceedings of the 2005 ACM symposium on Applied computing
Space efficiency in synopsis construction algorithms

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Online estimation for subset-based SQL queries

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Providing probabilistically-bounded approximate answers to non-holistic aggregate range queries in OLAP

Proceedings of the 8th ACM international workshop on Data warehousing and OLAP
Towards estimating the number of distinct value combinations for a set of attributes

Proceedings of the 14th ACM international conference on Information and knowledge management
Wavelet synopses for general error metrics

ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2004
Improving range-sum query evaluation on data cubes via polynomial approximation

Data & Knowledge Engineering
Sample-Based Quality Estimation of Query Results in Relational Database Environments

IEEE Transactions on Knowledge and Data Engineering
Supporting ad-hoc ranking aggregates

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Robust computation of aggregates in wireless sensor networks: distributed randomized algorithms and analysis

IPSN '05 Proceedings of the 4th international symposium on Information processing in sensor networks
DSM-PLW: single-pass mining of path traversal patterns over streaming web click-sequences

Computer Networks: The International Journal of Computer and Telecommunications Networking - Web dynamics
SimFlex: Statistical Sampling of Computer System Simulation

IEEE Micro
Robust Computation of Aggregates in Wireless Sensor Networks: Distributed Randomized Algorithms and Analysis

IEEE Transactions on Parallel and Distributed Systems
Adaptive execution of variable-accuracy functions

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
On biased reservoir sampling in the presence of stream evolution

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Delay aware querying with seaweed

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Window-aware load shedding for aggregation queries over data streams

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Efficient detection of empty-result queries

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Pre-aggregation with probability distributions

DOLAP '06 Proceedings of the 9th ACM international workshop on Data warehousing and OLAP
The Sort-Merge-Shrink join

ACM Transactions on Database Systems (TODS)
Online Random Shuffling of Large Database Tables

IEEE Transactions on Knowledge and Data Engineering
Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more

Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more
The price of validity in dynamic networks

Journal of Computer and System Sciences
Approximate range-sum query answering on data cubes with probabilistic guarantees

Journal of Intelligent Information Systems
Optimized stratified sampling for approximate query processing

ACM Transactions on Database Systems (TODS)
Extended wavelets for multiple measures

ACM Transactions on Database Systems (TODS)
Scalable approximate query processing with the DBO engine

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Sketching unaggregated data streams for subpopulation-size queries

Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
ProgME: towards programmable network measurement

Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications
Monitoring streams: a new class of data management applications

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Progressive merge join: a generic and non-blocking sort-based join algorithm

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Efficient exploration of large scientific databases

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Algorithms and estimators for accurate summarization of internet traffic

Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
Priority sampling for estimation of arbitrary subset sums

Journal of the ACM (JACM)
The history of histograms (abridged)

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Robust estimation with sampling and approximate pre-aggregation

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
ATLAS: a small but complete SQL extension for data mining and data streams

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
XWAVE: optimal and approximate extended wavelets

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Query languages and data models for database sequences and data streams

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Model-driven data acquisition in sensor networks

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
A Bayesian method for guessing the extreme values in a data set?

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Randomized algorithms for data reconciliation in wide area aggregate query processing

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Processing forecasting queries

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Supporting time-constrained SQL queries in oracle

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Deterministic algorithms for sampling count data

Data & Knowledge Engineering
Adaptive-sampling algorithms for answering aggregation queries on Web sites

Data & Knowledge Engineering
Speculative plan execution for information gathering

Artificial Intelligence
Graph summarization with bounded error

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
OLAP on sequence data

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
MCDB: a monte carlo approach to managing uncertain data

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
The DBO database system

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Approximating predicates and expressive queries on probabilistic databases

Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Extracting k most important groups from data efficiently

Data & Knowledge Engineering
Probabilistic top-k and ranking-aggregate queries

ACM Transactions on Database Systems (TODS)
Confidence bounds for sampling-based group by estimates

ACM Transactions on Database Systems (TODS)
Maintaining very large random samples using the geometric file

The VLDB Journal — The International Journal on Very Large Data Bases
Wavelet synopsis for hierarchical range queries with workloads

The VLDB Journal — The International Journal on Very Large Data Bases
A survey of top-k query processing techniques in relational database systems

ACM Computing Surveys (CSUR)
Online mining of frequent sets in data streams with error guarantee

Knowledge and Information Systems
A research agenda for query processing in large-scale peer data management systems

Information Systems
Scalable approximate query processing with the DBO engine

ACM Transactions on Database Systems (TODS)
Plot Query Processing with Wavelets

SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
On the space-time of optimal, approximate and streaming algorithms for synopsis construction problems

The VLDB Journal — The International Journal on Very Large Data Bases
Architecture of a Database System

Foundations and Trends in Databases
Improving estimation accuracy of aggregate queries on data cubes

Proceedings of the ACM 11th international workshop on Data warehousing and OLAP
ODMCA: An adaptive data mining control algorithm in multicarrier networks

Computer Communications
New join operator definitions for sensor network databases

AEE'07 Proceedings of the 6th conference on Applications of electrical engineering
The design of a query monitoring system

ACM Transactions on Database Systems (TODS)
Semantics and implementation of continuous sliding window queries over data streams

ACM Transactions on Database Systems (TODS)
Flexible and efficient querying and ranking on hyperlinked data sources

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Guessing the extreme values in a data set: a Bayesian method and its applications

The VLDB Journal — The International Journal on Very Large Data Bases
Sampling-based estimators for subset-based queries

The VLDB Journal — The International Journal on Very Large Data Bases
AMID: Approximation of MultI-measured Data using SVD

Information Sciences: an International Journal
Data reduction for data analysis

ECC'08 Proceedings of the 2nd conference on European computing conference
Progressive Evaluation of XML Queries for Online Aggregation and Progress Indicator

DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications
Learning value predictors for the speculative execution of information gathering plans

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Inductive learning in less than one sequential data scan

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Improving estimation accuracy of aggregate queries on data cubes

Data & Knowledge Engineering
Statistical structures for Internet-scale data management

The VLDB Journal — The International Journal on Very Large Data Bases
MAD skills: new analysis practices for big data

Proceedings of the VLDB Endowment
Enabling ε-approximate querying in sensor networks

Proceedings of the VLDB Endowment
Turbo-charging estimate convergence in DBO

Proceedings of the VLDB Endowment
Distributed online aggregations

Proceedings of the VLDB Endowment
Tuning database configuration parameters with iTuned

Proceedings of the VLDB Endowment
An experimental study of time-constrained aggregate queries

Proceedings of the 13th International Conference on Extending Database Technology
Quality contracts for real-time enterprises

BIRTE'06 Proceedings of the 1st international conference on Business intelligence for the real-time enterprises
Beyond average: toward sophisticated sensing with queries

IPSN'03 Proceedings of the 2nd international conference on Information processing in sensor networks
On the variance of subset sum estimation

ESA'07 Proceedings of the 15th annual European conference on Algorithms
Beyond online aggregation: parallel and incremental data mining with online Map-Reduce

Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
A new virtual select database operation for wireless sensor networks

EUC'07 Proceedings of the 2007 conference on Emerging direction in embedded and ubiquitous computing
Index structures for data warehouses

Index structures for data warehouses
PR-join: a non-blocking join achieving higher early result rate with statistical guarantees

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
ParaTimer: a progress indicator for MapReduce DAGs

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Continuous sampling for online aggregation over multiple queries

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Online aggregation and continuous query support in MapReduce

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Dynamically provisioning distributed systems to meet target levels of performance, availability, and data quality

Future directions in distributed computing
IRSJ: incremental refining spatial joins for interactive queries in GIS

Geoinformatica
MapReduce online

NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
The declarative imperative: experiences and conjectures in distributed logic

ACM SIGMOD Record
Processing exact results for sliding window joins over data streams using disk storage

International Journal of Intelligent Information and Database Systems
Identifying the challenges for optimizing the process to achieve reproducible results in e-science applications

PIKM '10 Proceedings of the 3rd workshop on Ph.D. students in information and knowledge management
Approximate query answering and result refinement on XML data

SSDBM'10 Proceedings of the 22nd international conference on Scientific and statistical database management
Towards approximate SQL: infobright's approach

RSCTC'10 Proceedings of the 7th international conference on Rough sets and current trends in computing
Efficiently computing and querying multidimensional OLAP data cubes over probabilistic relational data

ADBIS'10 Proceedings of the 14th east European conference on Advances in databases and information systems
Effective and efficient sampling methods for deep web aggregation queries

Proceedings of the 14th International Conference on Extending Database Technology
ProgME: towards programmable network measurement

IEEE/ACM Transactions on Networking (TON)
Beyond simple aggregates: indexing for summary queries

Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A latency and fault-tolerance optimizer for online parallel query plans

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Efficient approximate top-k query algorithm using cube index

APWeb'11 Proceedings of the 13th Asia-Pacific web conference on Web technologies and applications
Sequenced spatiotemporal aggregation for coarse query granularities

The VLDB Journal — The International Journal on Very Large Data Bases
Effective stratification for low selectivity queries on deep web data sources

Proceedings of the 20th ACM international conference on Information and knowledge management
Aggregation strategies for columnar in-memory databases in a mixed workload

Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management
Efficient non-blocking top-k query processing in distributed networks

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Randomized accuracy-aware program transformations for efficient approximate computations

POPL '12 Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
A probabilistic framework for estimating the accuracy of aggregate range queries evaluated over histograms

Information Sciences: an International Journal
A single-pass online data mining algorithm combined with control theory with limited memory in dynamic data streams

GCC'05 Proceedings of the 4th international conference on Grid and Cooperative Computing
An interactive framework for spatial joins: a statistical approach to data analysis in GIS

Geoinformatica
Hierarchical group-based sampling

BNCOD'05 Proceedings of the 22nd British National conference on Databases: enterprise, Skills and Innovation
A programmable pipelined queue for approximate string matching

KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part IV
Estimating the overlapping area of polygon join

SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
An incremental refining spatial join algorithm for estimating query results in GIS

DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Approximate query processing for database flexible querying with aggregates

Transactions on Large-Scale Data- and Knowledge-Centered Systems V
Trust me, I'm partially right: incremental visualization lets analysts explore large datasets faster

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
What next?: a half-dozen data management research goals for big data and the cloud

PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
SkewTune: mitigating skew in mapreduce applications

SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Improving online aggregation performance for skewed data distribution

DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Halt or continue: estimating progress of queries in the cloud

DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part II
Research directions in data wrangling: visualizations and transformations for usable and credible data

Information Visualization - Special issue on State of the Field and New Research Directions
Early accurate results for advanced analytics on MapReduce

Proceedings of the VLDB Endowment
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches

Foundations and Trends in Databases
Histograms as statistical estimators for aggregate queries

Information Systems
You can stop early with COLA: online processing of aggregate queries in the cloud

Proceedings of the 21st ACM international conference on Information and knowledge management
Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling

ACM Transactions on Architecture and Code Optimization (TACO)
Taming massive distributed datasets: data sampling using bitmap indices

Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
BlinkDB: queries with bounded errors and bounded response times on very large data

Proceedings of the 8th ACM European Conference on Computer Systems
Driver input selection for main-memory multi-way joins

Proceedings of the 28th Annual ACM Symposium on Applied Computing
Parallel online aggregation in action

Proceedings of the 25th International Conference on Scientific and Statistical Database Management
Bottom-k and priority sampling, set similarity and subset sums with minimal independence

Proceedings of the forty-fifth annual ACM symposium on Theory of computing
Ad-hoc aggregate query processing algorithms based on bit-store for query intensive applications in cloud computing

Future Generation Computer Systems
Distributed data management using MapReduce

ACM Computing Surveys (CSUR)
pEDM: online-forecasting for smart energy analytics

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Indexing for summary queries: Theory and practice

ACM Transactions on Database Systems (TODS)
Processing online aggregation on skewed data in mapreduce

Proceedings of the fifth international workshop on Cloud data management
Memory-efficient groupby-aggregate using compressed buffer trees

Proceedings of the 4th annual Symposium on Cloud Computing
Sampling estimators for parallel online aggregation

BNCOD'13 Proceedings of the 29th British National conference on Big Data
Scalable progressive analytics on big data in the cloud

Proceedings of the VLDB Endowment
A sampling algebra for aggregate estimation

Proceedings of the VLDB Endowment
Optimizing Sample Design for Approximate Query Processing

International Journal of Knowledge-Based Organizations
imMens: real-time visual querying of big data

EuroVis '13 Proceedings of the 15th Eurographics Conference on Visualization
GRASS: trimming stragglers in approximation analytics

NSDI'14 Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation

Abstract

Aggregation in traditional database systems is performed in batch mode: a query is submitted, the system processes a large volume of data over a long period of time, and, eventually, the final answer is returned. This archaic approach is frustrating to users and has been abandoned in most other areas of computing. In this paper we propose a new online aggregation interface that permits users to both observe the progress of their aggregation queries and control execution on the fly. After outlining usability and performance requirements for a system supporting online aggregation, we present a suite of techniques that extend a database system to meet these requirements. These include methods for returning the output in random order, for providing control over the relative rate at which different aggregates are computed, and for computing running confidence intervals. Finally, we report on an initial implementation of online aggregation in POSTGRES.
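
As a rough illustration of two of the techniques named in the abstract (returning tuples in random order and computing running confidence intervals), the sketch below maintains a running AVG with an interval that tightens as more rows are scanned. It is not the paper's implementation. The function name `online_avg`, its parameters, the large-sample (normal-approximation) interval, and the finite-population correction are all assumptions made for this example.

```python
import math
import random


def online_avg(values, z=1.96, seed=0, report_every=1000):
    """Yield (rows_seen, running_mean, half_width) while scanning in random order.

    Hypothetical helper for illustration only. The interval is a
    normal-approximation interval for the mean of a simple random sample
    drawn without replacement, so it is only meaningful once many rows
    have been seen.
    """
    order = list(values)
    total = len(order)
    # A random scan order makes every prefix a simple random sample.
    random.Random(seed).shuffle(order)

    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for x in order:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)

        if n % report_every == 0 or n == total:
            if n > 1:
                sample_var = m2 / (n - 1)
                # Finite-population correction: the interval collapses to
                # zero width once every row has been scanned.
                fpc = 1.0 - n / total
                half_width = z * math.sqrt(sample_var / n * fpc)
            else:
                half_width = float("inf")
            yield n, mean, half_width


if __name__ == "__main__":
    data = [float(v) for v in range(1, 10001)]
    for seen, estimate, half in online_avg(data, report_every=2000):
        print(f"{seen:>6} rows: AVG ~ {estimate:.2f} +/- {half:.2f}")
```

A caller could stop consuming the generator as soon as the half-width is small enough, which corresponds to the kind of on-the-fly control the abstract describes. The default `z=1.96` corresponds to a nominal 95% confidence level under the normal approximation.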