Mining graph data is an increasingly popular challenge with practical applications in many areas, including molecular substructure discovery, web link analysis, fraud detection, and social network analysis. The problem is to enumerate all subgraphs occurring in at least σ graphs of a database, where σ is a user-specified support threshold. Chip multiprocessors (CMPs) provide true parallel processing and are expected to become the de facto standard for commodity computing. In this work, building on the state of the art, we propose an efficient approach to parallelizing such algorithms for CMPs. We show that an algorithm which adapts its behavior to the runtime state of the system can improve system utilization and lower execution times. Most notably, we incorporate dynamic state management, allowing memory consumption to vary with availability. We evaluate our techniques on current shared-memory systems (SMPs) and expect similar performance on CMPs. We demonstrate excellent speedup, 27-fold on 32 processors, for several real-world datasets. Additionally, we show that our dynamic techniques afford this scalability while consuming up to 35% less memory than static techniques.
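The support-counting core of the problem statement above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it counts only single-edge patterns (a full frequent-subgraph miner would grow these seeds and verify occurrences with subgraph-isomorphism checks), and the function name, graph encoding, and toy database are invented for the example.

```python
from collections import Counter

def frequent_edges(graphs, sigma):
    """Return the edge patterns occurring in at least `sigma` graphs.

    Each graph is given as a list of (u, v) edges; edges are treated as
    undirected. Support is the number of graphs containing the pattern,
    so each distinct edge is counted at most once per graph.
    """
    support = Counter()
    for g in graphs:
        # deduplicate within one graph: support counts graphs, not occurrences
        for edge in {frozenset(e) for e in g}:
            support[edge] += 1
    return {e: c for e, c in support.items() if c >= sigma}

# hypothetical database of three small graphs
db = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("c", "d")],
    [("a", "b"), ("b", "c"), ("d", "e")],
]
print(frequent_edges(db, sigma=2))
# {a,b} occurs in all 3 graphs, {b,c} in 2; the rest fall below sigma
```

In a parallel setting, the pattern-extension search tree rooted at each frequent seed is what gets distributed across cores, which is where adaptive scheduling and dynamic state management come into play.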