Multilevel k-way hypergraph partitioning
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics
Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics
The webgraph framework I: compression techniques
Proceedings of the 13th international conference on World Wide Web
Information-theoretic tools for mining database structure from large data sets
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Fully automatic cross-associations
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Mining compressed frequent-pattern sets
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Neighborhood Formation and Anomaly Detection in Bipartite Graphs
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Fast Random Walk with Restart and Its Applications
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Interpreting the data: Parallel analysis with Sawzall
Scientific Programming - Dynamic Grids and Worldwide Computing
Fast and practical indexing and querying of very large graphs
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Mining significant graph patterns by leap search
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Column-stores vs. row-stores: how different are they really?
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Data mining using high performance data clouds: experimental studies using sector and sphere
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
SCOPE: easy and efficient parallel processing of massive data sets
Proceedings of the VLDB Endowment
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
SmallBlue: Social Network Analysis for Expertise Search and Collective Intelligence
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
On compressing social networks
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
BBM: bayesian browsing model from petabyte-scale data
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
An architecture for recycling intermediates in a column-store
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Distributed data-parallel computing using a high-level programming language
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
A demonstration of SciDB: a science-oriented DBMS
Proceedings of the VLDB Endowment
Column-oriented database systems
Proceedings of the VLDB Endowment
GConnect: a connectivity index for massive disk-resident graphs
Proceedings of the VLDB Endowment
Finding a maximum-weight induced k-partite subgraph of an i-triangulated graph
Discrete Applied Mathematics
Approximating betweenness centrality
WAW'07 Proceedings of the 5th international conference on Algorithms and models for the web-graph
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Positional update handling in column stores
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Fast nearest-neighbor search in disk-resident graphs
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Neighbor query friendly compression of social networks
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
GBASE: a scalable and general graph management system
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Realistic, mathematically tractable graph generation and evolution, using kronecker multiplication
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Beyond 'Caveman Communities': Hubs and Spokes for Graph Compression and Mining
ICDM '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining
OddBall: spotting anomalies in weighted graphs
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
Big graph mining: algorithms and discoveries
ACM SIGKDD Explorations Newsletter
TurboGraph: a fast parallel graph engine handling billion-scale graphs in a single PC
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Specialized storage for big numeric time series
HotStorage'13 Proceedings of the 5th USENIX conference on Hot Topics in Storage and File Systems
Hi-index | 0.00 |
Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are common-place. How to store such large graphs efficiently? What are the core operations/queries on those graph? How to answer the graph queries quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for a parallel, distributed settings and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using Mapreduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space, as well as accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.