Distributed database systems exploit static workload characteristics to steer data fragmentation and data allocation schemes. However, the grand challenge of distributed query processing is to come up with a self-organizing architecture that exploits all resources to manage the hot data set, minimize query response time, and maximize throughput without global coordination. In this paper, we introduce the Data Cyclotron architecture, which addresses these challenges using turbulent data movement through a storage ring built from distributed main memory and capitalizing on modern remote-DMA facilities. Queries assigned to individual nodes interact with the Data Cyclotron by picking up data fragments from the hot set, which flows continuously around the ring. Each data fragment carries a level of interest (LOI) metric, which represents the cumulative query interest as the fragment passes around the ring multiple times. A fragment with an LOI below a given threshold, inversely proportional to the ring load, is pulled out to free up resources. This threshold is dynamically adjusted in a distributed manner based on ring characteristics and query needs. It optimizes resource utilization while keeping the average data access delay low. The proposed architecture has a modest impact on existing query execution engines. This is illustrated using an extensive, validated simulation study of the Data Cyclotron protocols. The results underpin their robustness in turbulent workload scenarios as well as in the TPC-H scenario. Furthermore, we think that using state-of-the-art network technology, e.g., RDMA, could lead to even more promising results. The Data Cyclotron architecture opens a new vista for modern distributed database architectures, with a plethora of research challenges that have barely been touched upon.
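The LOI-based eviction described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the additive LOI update, the `Fragment` class, the `circulate` function, and the fixed `threshold` argument are all hypothetical, since the abstract specifies neither the exact LOI formula nor how the threshold is derived from the ring load.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    loi: float = 0.0  # level of interest accumulated so far


def circulate(fragments, interest_per_cycle, threshold):
    """Simulate one trip of all fragments around the ring.

    interest_per_cycle maps a fragment name to the number of queries that
    picked the fragment up during this trip (hypothetical input).
    The threshold is passed in as a plain number; deriving it from the
    ring load is outside the scope of this sketch.
    Returns (kept, evicted).
    """
    kept, evicted = [], []
    for frag in fragments:
        # Assumption: interest simply adds up across trips.
        frag.loi += interest_per_cycle.get(frag.name, 0)
        # A fragment whose LOI is below the threshold leaves the ring.
        if frag.loi < threshold:
            evicted.append(frag)
        else:
            kept.append(frag)
    return kept, evicted


ring = [Fragment("orders"), Fragment("lineitem"), Fragment("nation")]
kept, evicted = circulate(ring, {"orders": 3, "lineitem": 1}, threshold=1)
```

In this toy run, the fragment that no query touched falls below the threshold and is pulled out, while the two fragments with recorded interest stay in circulation.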