In many applications, it is convenient to substitute a large data graph with a smaller homomorphic graph. This paper investigates approaches for summarising massive data graphs. Massive data graphs are generally processed using a shared-nothing infrastructure such as MapReduce. However, accurate graph summarisation algorithms are suboptimal in this kind of environment because they require multiple iterations over the data graph. We investigate approximate graph summarisation algorithms that are efficient to compute in a shared-nothing infrastructure. We define a model for assessing the quality of a summary with regard to a gold standard summary. Over several datasets, we evaluate the trade-offs between the efficiency and the precision of the algorithms. With regard to an application, the experiments highlight the need to trade off the precision and volume of a graph summary against the complexity of a summarisation technique.
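To make the idea of a smaller homomorphic graph concrete, here is a minimal sketch of one possible approximate summary. It is an illustration only and not necessarily one of the algorithms evaluated in the paper. It assumes the data graph is given as (source, label, target) triples and groups nodes by their set of outgoing edge labels, which needs no iteration to a fixpoint. The function name `summarise_by_out_labels` is hypothetical.

```python
from collections import defaultdict


def summarise_by_out_labels(edges):
    """Group nodes by their set of outgoing edge labels (an assumed,
    illustrative summarisation criterion).

    edges: iterable of (source, label, target) triples.
    Returns (node_to_block, summary_edges), where each block is identified
    by the frozenset of outgoing labels shared by its member nodes.
    """
    edges = list(edges)
    out_labels = defaultdict(set)
    nodes = set()
    # First pass: collect every node and its outgoing labels.
    for source, label, target in edges:
        out_labels[source].add(label)
        nodes.add(source)
        nodes.add(target)
    # Nodes with identical outgoing label sets fall into the same block.
    node_to_block = {n: frozenset(out_labels.get(n, ())) for n in nodes}
    # Second pass: map every data edge to an edge between blocks, so the
    # node-to-block mapping is a graph homomorphism onto the summary.
    summary_edges = {
        (node_to_block[s], label, node_to_block[t]) for s, label, t in edges
    }
    return node_to_block, summary_edges
```

Because every data edge is mapped to an edge between the blocks of its endpoints, the summary preserves all labelled edges of the original graph while usually containing far fewer nodes. A more accurate summary, such as one based on bisimulation, would refine these blocks repeatedly, which is the kind of multi-iteration cost the abstract refers to.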