Today's complex world requires state-of-the-art data analysis over truly massive data sets. These data sets can be stored persistently in databases or flat files, or can be generated continuously in real time. An associated set is a collection of data sets, each annotated by a value of a domain D. These data sets are populated from a data source according to a condition θ and the annotating value. An ASsociated SET (ASSET) query consists of repeated, successive, interrelated definitions of associated sets, put together in a column-wise fashion, resembling a spreadsheet document. We present DataMingler, a powerful GUI for expressing and managing ASSET queries, data sources, and aggregate functions, and the ASSET Query Engine (QE), which evaluates ASSET queries efficiently. We argue that ASSET queries: a) constitute a useful class of OLAP queries, b) are suitable for distributed processing settings, and c) extend the MapReduce paradigm in a declarative way.
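To make the definition of an associated set concrete, the following is a minimal in-memory sketch, not the DataMingler interface or the ASSET Query Engine. The function `associated_set`, the `sales` rows, and the column names are hypothetical and chosen only for illustration. It assumes a small data source held in a list and a condition θ given as a Python predicate over a domain value and a row.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, List

Row = Dict[str, Any]


def associated_set(
    domain: Iterable[Hashable],
    source: List[Row],
    theta: Callable[[Hashable, Row], bool],
) -> Dict[Hashable, List[Row]]:
    """For each value d of the domain D, collect the rows of the data
    source that satisfy the condition theta(d, row)."""
    return {d: [row for row in source if theta(d, row)] for d in domain}


# Hypothetical data source: one row per sale.
sales = [
    {"cust": "a", "amount": 10},
    {"cust": "a", "amount": 30},
    {"cust": "b", "amount": 5},
    {"cust": "b", "amount": 7},
    {"cust": "b", "amount": 12},
]
customers = ["a", "b"]  # the domain D

# First column: each customer's own sales, then an aggregate over them.
own = associated_set(customers, sales, lambda d, r: r["cust"] == d)
avg = {d: sum(r["amount"] for r in rows) / len(rows) for d, rows in own.items()}

# Second column: its condition refers to the first column's aggregate,
# which is what makes the successive definitions interrelated.
above = associated_set(
    customers, sales, lambda d, r: r["cust"] == d and r["amount"] > avg[d]
)
count_above = {d: len(rows) for d, rows in above.items()}
```

Each dictionary here plays the role of one spreadsheet-like column keyed by the domain values. The nested scan over the whole source is only for readability; an efficient evaluation would partition or index the source rather than rescan it for every domain value.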