PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
The power of contemporary processors relies increasingly on multicore architectures. This power is accessible only to parallel applications that can provide work for each core. Creating a scalable parallel or multithreaded application that uses the available cores efficiently is a difficult task, especially if I/O performance must be considered as well. We consider a multithreaded database loader with a compression function and examine its performance from a number of perspectives. Because compression is computationally intensive, parallel execution can potentially provide a large advantage in this case. We present and discuss a list of the performance-related areas we encountered, and we identify and verify tools for dealing with each of them. We find that only an orchestrated use of several tools brings the desired effect. The discussion provides a general procedure one can follow when improving the performance of multithreaded programs, and it points out the key performance areas specific to the database loader. Special attention is given to the performance variations observed when many parallel threads are active on a multicore CPU. A significant slowdown of computations is observed when many threads compute simultaneously. The slowdown is related mainly to memory access and cache behavior, and it is much larger on a Core2 Quad system than on a dual Xeon machine.
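As a rough illustration of the kind of workload described above, the following minimal sketch compresses independent chunks of input on a pool of worker threads. It is not the loader studied in the paper: the function names, the chunk size, the worker count, and the choice of zlib as the compressor are all assumptions made for this example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut the input into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_parallel(data: bytes, chunk_size: int = 1 << 16,
                      workers: int = 4) -> List[bytes]:
    """Compress each chunk independently on a thread pool.

    Hypothetical helper: chunk_size and workers are illustrative defaults,
    not values taken from the paper.
    """
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the blocks can be
        # written out sequentially afterwards.
        return list(pool.map(zlib.compress, chunks))


def decompress_all(blocks: List[bytes]) -> bytes:
    """Reverse of compress_parallel, used here only to check correctness."""
    return b"".join(zlib.decompress(block) for block in blocks)
```

Whether such a scheme scales with the number of threads depends on the factors the abstract names, in particular memory access and cache behavior, so any speedup would have to be measured on the target machine rather than assumed.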