In large data warehousing environments, it is often advantageous to provide fast, approximate answers to complex aggregate queries based on samples. However, uniformly extracted samples often do not guarantee acceptable accuracy in grouping interval estimations. This is crucial in most less-aggregated analyses, which are mostly based on recent data (e.g. forecasting, performance analysis). We propose the use of time-interval stratified samples (TISS), a simple sampling strategy that biases towards recency. This improves the accuracy in important less-aggregated analysis without significantly deteriorating aggregated analysis on older data. TISS obtains a much better accuracy than either uniform or the recently proposed congressional samples (CS) for queries analyzing recent data and can be coupled with CS to provide minimal representation guarantees (TISS-CS). We discuss TISS design, the loading process and the query processing middle-layer. We show that TISS is very easily integrated in a data warehouse and works transparently. TISS is evaluated experimentally in a TPC-H setup.
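As a rough illustration of the idea of biasing a sample towards recency, the following minimal Python sketch stratifies rows by time interval, samples recent intervals at a higher rate, and scales each sampled row by its stratum's inverse sampling fraction when estimating a grouped SUM. It is not the TISS design from the paper: the function names (`build_time_stratified_sample`, `estimate_group_sum`), the row layout `(period, group, value)` and the per-period sampling rates are all assumptions made for the example.

```python
import random
from collections import defaultdict


def build_time_stratified_sample(rows, rates, seed=0):
    """Draw a sample stratified by time interval.

    rows  : list of (period, group, value) tuples (hypothetical layout)
    rates : dict mapping each period to its sampling fraction; giving
            recent periods a larger fraction biases the sample to recency
    Returns a list of (period, group, value, weight) tuples, where weight
    is the number of base rows each sampled row stands for.
    """
    rng = random.Random(seed)

    # Partition the base rows into one stratum per time interval.
    strata = defaultdict(list)
    for row in rows:
        strata[row[0]].append(row)

    sample = []
    for period, stratum in strata.items():
        # Keep at least one row per stratum so no interval vanishes.
        k = max(1, round(rates[period] * len(stratum)))
        weight = len(stratum) / k
        for p, g, v in rng.sample(stratum, k):
            sample.append((p, g, v, weight))
    return sample


def estimate_group_sum(sample, period=None):
    """Estimate SUM(value) GROUP BY group, optionally for one period."""
    totals = defaultdict(float)
    for p, g, v, w in sample:
        if period is None or p == period:
            totals[g] += v * w  # scale each row by its stratum weight
    return dict(totals)


# Hypothetical usage: sample 1% of old data but 50% of recent data.
rows = [("old", "A", 1.0)] * 1000 + [("recent", "A", 1.0)] * 1000
sample = build_time_stratified_sample(rows, {"old": 0.01, "recent": 0.5})
recent_estimate = estimate_group_sum(sample, period="recent")
```

With these assumed rates, a query restricted to the recent interval is answered from far more sampled rows than one over old data, which is the intuition behind the accuracy gain for less-aggregated analyses of recent data.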