Cooperative scans: dynamic bandwidth sharing in a DBMS
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
MRShare: sharing across multiple queries in MapReduce
Proceedings of the VLDB Endowment
Hadoop++: making a yellow elephant run like a cheetah (without it even noticing)
Proceedings of the VLDB Endowment
Merging what's cracked, cracking what's merged: adaptive indexing in main-memory column-stores
Proceedings of the VLDB Endowment
Trojan data layouts: right shoes for a running elephant
Proceedings of the 2nd ACM Symposium on Cloud Computing
Only aggressive elephants are fast elephants
Proceedings of the VLDB Endowment
NoDB in action: adaptive query processing on raw data
Proceedings of the VLDB Endowment
Invisible loading: access-driven data transfer from raw files into database systems
Proceedings of the 16th International Conference on Extending Database Technology
CARTILAGE: adding flexibility to the Hadoop skeleton
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Hi-index | 0.00 |
Mosquito is a lightweight and adaptive physical design framework for Hadoop. Mosquito connects to existing data pipelines in Hadoop MapReduce and/or HDFS, observes the data, and creates better physical designs, i.e. indexes, as a byproduct. Our approach is minimally invasive, yet it allows users and developers to easily improve the runtime of Hadoop. We present three important use cases: first, how to create indexes as a byproduct of data uploads into HDFS; second, how to create indexes as a byproduct of map tasks; and third, how to execute map tasks as a byproduct of HDFS data uploads. These use cases may even be combined.