Efficient aggregation for graph summarization

Authors:
Yuanyuan Tian;Richard A. Hankins;Jignesh M. Patel
Affiliations:
University of Michigan, Ann Arbor, MI, USA;Nokia Research Center, Palo Alto, CA, USA;University of Michigan, Ann Arbor, MI, USA
Venue:
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Year:
2008

Abstract

Graphs are widely used to model real-world objects and their relationships, and large graph datasets are common in many application domains. To understand the underlying characteristics of large graphs, graph summarization techniques are critical. However, existing graph summarization methods are mostly statistical: they study statistics such as degree distributions, hop-plots, and clustering coefficients. These statistical methods are very useful, but the resolution of their summaries is hard to control. In this paper, we introduce two database-style operations to summarize graphs. Like OLAP-style aggregation methods, which allow users to drill down or roll up to control the resolution of summarization, our methods provide analogous functionality for large graph datasets. The first operation, called SNAP, produces a summary graph by grouping nodes based on user-selected node attributes and relationships. The second operation, called k-SNAP, further allows users to control the resolution of summaries and provides "drill-down" and "roll-up" abilities to navigate through summaries with different resolutions. We propose an efficient algorithm to evaluate the SNAP operation. In addition, we prove that the k-SNAP computation is NP-complete, and we propose two heuristic methods to approximate the k-SNAP results. Through extensive experiments on a variety of real and synthetic datasets, we demonstrate the effectiveness and efficiency of the proposed methods.
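The abstract does not spell out how SNAP is evaluated, so the following is only an illustrative sketch of the grouping idea, not the paper's algorithm. It assumes a single undirected relationship type and assumes that a grouping is acceptable when all nodes in a group share the selected attribute values and are connected to the same set of groups. The function names `relabel` and `snap_sketch`, the input layout, and the example data are hypothetical.

```python
from collections import defaultdict


def relabel(keys):
    """Map each distinct key to a small integer id, in first-seen order."""
    ids = {}
    return {node: ids.setdefault(key, len(ids)) for node, key in keys.items()}


def snap_sketch(node_attrs, edges, attrs):
    """Group nodes by attributes, then split groups until every node in a
    group is adjacent to the same set of groups (an assumed criterion).

    node_attrs: dict mapping node -> dict of attribute values
    edges:      iterable of undirected (u, v) pairs over those nodes
    attrs:      attribute names selected by the user
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Start from the grouping induced by the selected attributes alone.
    group_of = relabel(
        {n: tuple(values[a] for a in attrs) for n, values in node_attrs.items()}
    )

    # Refine: a node's new group depends on its old group and on the set of
    # groups it is adjacent to. Stop when no group is split any further.
    while True:
        refined = relabel(
            {
                n: (group_of[n], frozenset(group_of[m] for m in neighbors[n]))
                for n in node_attrs
            }
        )
        if len(set(refined.values())) == len(set(group_of.values())):
            break
        group_of = refined

    # Build the summary graph: one node per group, one edge per connected pair.
    groups = defaultdict(list)
    for n, g in group_of.items():
        groups[g].append(n)
    group_edges = {tuple(sorted((group_of[u], group_of[v]))) for u, v in edges}
    return dict(groups), group_edges


if __name__ == "__main__":
    people = {
        "a": {"gender": "F"},
        "b": {"gender": "F"},
        "c": {"gender": "M"},
        "d": {"gender": "M"},
    }
    groups, group_edges = snap_sketch(people, [("a", "c"), ("b", "c")], ["gender"])
    print(groups)
    print(group_edges)
```

In this toy input, grouping by `gender` alone would put `c` and `d` together, but only `c` has edges to the other group, so the refinement step separates them. A k-SNAP-style operation would instead let the user fix the number of groups, which this sketch does not attempt.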