Advances in data collection and storage technologies have given rise to large dynamic data stores. To manage and mine such stores effectively on modern and emerging architectures, one must both design effective middleware support and re-architect algorithms so that performance is commensurate with technological advances. In this article, we present a top-down view of how one can achieve this goal for next-generation data analysis centers. Specifically, we present a case study on frequent pattern algorithms and show how such algorithms can be restructured to be cache-, memory-, and I/O-conscious. Furthermore, motivated by such algorithms, we present a service-oriented middleware framework for deriving high performance on next-generation architectures.