CSV: visualizing and mining cohesive subgraphs

Authors:
Nan Wang;Srinivasan Parthasarathy;Kian-Lee Tan;Anthony K. H. Tung
Affiliations:
National University of Singapore, Singapore, Singapore;The Ohio State University, Columbus, OH, USA;National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Year:
2008

Citing 23
Cited 20

FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
OPTICS: ordering points to identify the clustering structure

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
The value of strong inapproximability results for clique

STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
Parallel multilevel k-way partitioning scheme for irregular graphs

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Complete Mining of Frequent Patterns from Graphs: Mining Graph Data

Machine Learning
Similarity Search without Tears: The OMNI Family of All-purpose Access Methods

Proceedings of the 17th International Conference on Data Engineering
Frequent Subgraph Discovery

ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Massive Quasi-Clique Detection

LATIN '02 Proceedings of the 5th Latin American Symposium on Theoretical Informatics
Keyword Searching and Browsing in Databases using BANKS

ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Pivot selection techniques for proximity searching in metric spaces

Pattern Recognition Letters
Carpenter: finding closed patterns in long biological datasets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
COBBLER: Combining Column and Row Enumeration for Closed Pattern Discovery

SSDBM '04 Proceedings of the 16th International Conference on Scientific and Statistical Database Management
FARMER: finding interesting rule groups in microarray datasets

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Extremal Graph Theory

Extremal Graph Theory
Mining Frequent Closed Patterns in Microarray Data

ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Mining top-K covering rule groups for gene expression data

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Mining closed relational graphs with connectivity constraints

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Mining coherent dense subgraphs across massive biological networks for functional discovery

Bioinformatics
CLAN: An Algorithm for Mining Closed Cliques from Large Dense Graph Databases

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Mining market data: a network approach

Computers and Operations Research
Coherent closed quasi-clique discovery from large dense graph databases

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient IR-style keyword search over relational databases

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

A visual-analytic toolkit for dynamic interaction graphs

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Ranking-based clustering of heterogeneous information networks with star network schema

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Graph OLAP: a multi-dimensional framework for graph data analysis

Knowledge and Information Systems
Mining near-duplicate graph for cluster-based reranking of web video search results

ACM Transactions on Information Systems (TOIS)
DESSIN: mining dense subgraph patterns in a single graph

SSDBM'10 Proceedings of the 22nd international conference on Scientific and statistical database management
On triangulation-based dense neighborhood graph discovery

Proceedings of the VLDB Endowment
TGP: mining top-K frequent closed graph pattern without minimum support

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Mining multi-tag association for image tagging

World Wide Web
Efficient topological OLAP on information networks

DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications - Volume Part I
Content-driven detection of campaigns in social media

Proceedings of the 20th ACM international conference on Information and knowledge management
CP-index: on the efficient indexing of large graphs

Proceedings of the 20th ACM international conference on Information and knowledge management
FUSE: a system for data-driven multi-level functional summarization of protein interaction networks

Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
Fuse: towards multi-level functional summarization of protein interaction networks

Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Dense subgraph maintenance under streaming edge weight updates for real-time story identification

Proceedings of the VLDB Endowment
Integrating meta-path selection with user-guided object clustering in heterogeneous information networks

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Large scale cohesive subgraphs discovery for social network visual analysis

Proceedings of the VLDB Endowment
PathSelClus: Integrating Meta-Path Selection with User-Guided Object Clustering in Heterogeneous Information Networks

ACM Transactions on Knowledge Discovery from Data (TKDD) - Special Issue on ACM SIGKDD 2012
Database research at the National University of Singapore

ACM SIGMOD Record
PLASMA-HD: probing the lattice structure and makeup of high-dimensional data

Proceedings of the VLDB Endowment
Campaign extraction from social media

ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Intelligent Mobile Knowledge Discovery and Management Systems and Special Issue on Social Web Mining

Abstract

Extracting dense sub-components from graphs efficiently is an important objective in a wide range of application domains, from social network analysis to biological network analysis, and from the World Wide Web to stock market analysis. Motivated by this need, several new algorithms based on the (frequent) pattern mining paradigm have recently been proposed to tackle this problem. A limitation of most of these methods is that they are highly sensitive to parameter settings, rely on exhaustive enumeration with exponential time complexity, and often fail to help users understand the underlying distribution of components embedded within the host graph. In this article we propose an approximate algorithm to mine and visualize cohesive subgraphs (dense sub-components) within a large graph. The approach, referred to as Cohesive Subgraph Visualization (CSV), relies on a novel mapping strategy that maps edges and nodes to a multi-dimensional space in which dense areas correspond to cohesive subgraphs. The algorithm then walks through the dense regions in the mapped space to output a visual plot that effectively captures the overall dense sub-component distribution of the graph. Unlike extant algorithms with exponential complexity, CSV has a complexity of O(V^2 log V) when the mapping dimension parameter is fixed, where V is the number of vertices in the graph, although for many real datasets the performance is typically sub-quadratic. We demonstrate the utility of CSV as a stand-alone tool for visual graph exploration and as a pre-filtering step to significantly scale up exact subgraph mining algorithms such as CLAN.
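
To make the idea of a "walk through dense regions" that yields a plot more concrete, the following is a minimal toy sketch. It is not the CSV algorithm: the abstract does not give the mapping strategy, so the sketch swaps in a much simpler proxy (the number of common neighbours of an edge's endpoints, plus two) as the cohesion score, and uses an OPTICS-style greedy ordering. The function names `build_adjacency`, `edge_cohesion`, `cohesion_ordering` and `text_plot` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import heapq


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_cohesion(adj, u, v):
    """Assumed proxy score: common neighbours of u and v, plus the two endpoints.

    This is an upper bound on the size of any clique containing edge (u, v);
    it stands in for the paper's mapping, which is not described in the abstract.
    """
    return len(adj[u] & adj[v]) + 2


def cohesion_ordering(edges):
    """Visit vertices greedily, always following the most cohesive frontier edge.

    Returns a list of (vertex, score) pairs, where score is the cohesion of the
    edge through which the vertex was reached (0 for the start of a component).
    """
    adj = build_adjacency(edges)
    visited = set()
    order = []
    for start in sorted(adj):
        if start in visited:
            continue
        heap = [(0, start)]  # entries are (-score, vertex); heapq is a min-heap
        while heap:
            neg_score, u = heapq.heappop(heap)
            if u in visited:
                continue  # stale entry: u was already reached via a better edge
            visited.add(u)
            order.append((u, -neg_score))
            for w in sorted(adj[u]):
                if w not in visited:
                    heapq.heappush(heap, (-edge_cohesion(adj, u, w), w))
    return order


def text_plot(order):
    """Render the ordering as one bar per vertex; plateaus suggest dense regions."""
    return "\n".join(f"{vertex!s:>4} | {'#' * score}" for vertex, score in order)


if __name__ == "__main__":
    # Two 4-cliques joined by a single bridge edge (3, 4).
    clique_a = [(a, b) for a in range(4) for b in range(a + 1, 4)]
    clique_b = [(a, b) for a in range(4, 8) for b in range(a + 1, 8)]
    print(text_plot(cohesion_ordering(clique_a + clique_b + [(3, 4)])))
```

In this toy setting, the two cliques show up as plateaus of equal height separated by a dip at the bridge vertex, which is the kind of visual cue the abstract attributes to the CSV plot. Vertices in low regions of such a plot could also be discarded before running an exact miner, in the spirit of the pre-filtering use mentioned above, though the threshold and the pruning guarantee would depend on details not given here.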