Multilevel k-way hypergraph partitioning
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
The webgraph framework I: compression techniques
Proceedings of the 13th international conference on World Wide Web
Information-theoretic tools for mining database structure from large data sets
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Mining compressed frequent-pattern sets
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Neighborhood Formation and Anomaly Detection in Bipartite Graphs
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Fast Random Walk with Restart and Its Applications
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Interpreting the data: Parallel analysis with Sawzall
Scientific Programming - Dynamic Grids and Worldwide Computing
Fast and practical indexing and querying of very large graphs
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Mining significant graph patterns by leap search
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Column-stores vs. row-stores: how different are they really?
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Data mining using high performance data clouds: experimental studies using sector and sphere
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
SCOPE: easy and efficient parallel processing of massive data sets
Proceedings of the VLDB Endowment
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
SmallBlue: Social Network Analysis for Expertise Search and Collective Intelligence
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
On compressing social networks
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
BBM: bayesian browsing model from petabyte-scale data
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
An architecture for recycling intermediates in a column-store
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Distributed data-parallel computing using a high-level programming language
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
Column-oriented database systems
Proceedings of the VLDB Endowment
GConnect: a connectivity index for massive disk-resident graphs
Proceedings of the VLDB Endowment
Finding a maximum-weight induced k-partite subgraph of an i-triangulated graph
Discrete Applied Mathematics
Approximating betweenness centrality
WAW'07 Proceedings of the 5th international conference on Algorithms and models for the web-graph
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Positional update handling in column stores
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Fast nearest-neighbor search in disk-resident graphs
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Neighbor query friendly compression of social networks
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Realistic, mathematically tractable graph generation and evolution, using kronecker multiplication
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
OddBall: spotting anomalies in weighted graphs
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
Managing and mining large graphs: patterns and algorithms
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
GigaTensor: scaling tensor analysis up by 100 times - algorithms and discoveries
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
gbase: an efficient analysis platform for large graphs
The VLDB Journal — The International Journal on Very Large Data Bases
Partial view selection for evolving social graphs
First International Workshop on Graph Data Management Experiences and Systems
TurboGraph: a fast parallel graph engine handling billion-scale graphs in a single PC
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
A first view of exedra: a domain-specific language for large graph analytics workflows
Proceedings of the 22nd international conference on World Wide Web companion
The family of mapreduce and large-scale data processing systems
ACM Computing Surveys (CSUR)
Proceedings of the VLDB Endowment
Hi-index | 0.00 |
Graphs appear in numerous applications including cyber-security, the Internet, social networks, protein networks, recommendation systems, and many more. Graphs with millions or even billions of nodes and edges are common-place. How to store such large graphs efficiently? What are the core operations/queries on those graph? How to answer the graph queries quickly? We propose GBASE, a scalable and general graph management and mining system. The key novelties lie in 1) our storage and compression scheme for a parallel setting and 2) the carefully chosen graph operations and their efficient implementation. We designed and implemented an instance of GBASE using MapReduce/Hadoop. GBASE provides a parallel indexing mechanism for graph mining operations that both saves storage space, as well as accelerates queries. We ran numerous experiments on real graphs, spanning billions of nodes and edges, and we show that our proposed GBASE is indeed fast, scalable and nimble, with significant savings in space and time.