A characterization of data mining algorithms on a modern processor
DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
This paper characterizes the performance and memory-access behavior of a decision tree induction program, a previously unstudied application used in data mining and knowledge discovery in databases. Performance is studied via RSIM, an execution-driven simulator, for three uniprocessor models that exploit instruction-level parallelism to varying degrees. Several properties of the program are noted. Out-of-order dispatch and multiple issue provide a significant performance advantage: a 50%–250% improvement in IPC for out-of-order versus in-order dispatch, and a 5%–120% improvement in IPC for four-way versus single issue. Multiple issue yields a greater performance improvement for larger L2 cache sizes, when the program is limited by CPU performance; out-of-order dispatch yields a greater improvement for smaller L2 cache sizes. The program has a very small instruction footprint: for an 8 kB L1 instruction cache, the instruction miss rate is below 0.1%. A small (8 kB) L1 data cache is sufficient to capture most of the locality of the data references, resulting in L1 miss rates between 10% and 20%. Increasing the size of the L2 data cache does not significantly improve performance until a significant fraction (over one quarter) of the dataset fits into the L2 cache. Lastly, a procedure is developed for scaling the cache sizes when using scaled-down datasets, allowing results for smaller datasets to predict the performance of full-sized datasets.
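The abstract does not spell out the cache-scaling procedure, but a plausible minimal sketch, assuming the idea is to keep the cache-to-dataset ratio of the scaled-down run equal to that of the full-sized configuration, looks like this (the function name and parameters are hypothetical, not taken from the paper):

```python
def scaled_cache_size(full_cache_kb: float,
                      full_dataset_kb: float,
                      small_dataset_kb: float) -> float:
    """Return the simulated L2 cache size (kB) for a scaled-down
    dataset, preserving the full run's cache-to-dataset ratio.

    This is an illustrative assumption about the procedure, not the
    paper's exact method.
    """
    ratio = full_cache_kb / full_dataset_kb  # fraction of dataset that fits in cache
    return ratio * small_dataset_kb

# Example: a 1 MB L2 cache paired with a 16 MB dataset corresponds
# to a 64 kB L2 cache when simulating a 1 MB scaled-down dataset.
print(scaled_cache_size(1024, 16 * 1024, 1024))  # → 64.0
```

Under this assumption, simulator results for the small dataset at the scaled cache size would predict behavior of the full dataset at the full cache size, since both runs see the same fraction of their working set resident in the L2 cache.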