Compact representation of large RDF data sets for publishing and exchange

Authors:
Javier D. Fernández;Miguel A. Martínez-Prieto;Claudio Gutierrez
Affiliations:
Department of Computer Science, Universidad de Valladolid, Spain;Department of Computer Science, Universidad de Valladolid, Spain and Department of Computer Science, Universidad de Chile, Chile;Department of Computer Science, Universidad de Chile, Chile
Venue:
ISWC'10 Proceedings of the 9th international semantic web conference on The semantic web - Volume Part I
Year:
2010

Abstract

Increasingly large RDF data sets are being published on the Web. Currently, they use different RDF syntaxes, contain high levels of redundancy, and have a plain, indivisible structure. All this leads to fuzzy publications, inefficient management, complex processing, and a lack of scalability. This paper presents a novel RDF representation (HDT) that takes advantage of the structural properties of RDF graphs to split RDF data into three efficiently represented components: Header, Dictionary, and Triples structure. On-demand management operations can be implemented on top of the HDT representation. Experiments show that HDT makes data sets more than fifteen times smaller than the current naive representation, improving parsing and processing while keeping a consistent publication scheme. For exchange, specific compression techniques applied over HDT improve on current compression solutions.
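
The three-component split described in the abstract can be illustrated with a minimal sketch. It is not the HDT format itself: the function names `encode` and `decode`, the header fields, and the in-memory layout are assumptions made for illustration. The sketch only shows the general idea of replacing repeated term strings with integer IDs (Dictionary), storing the graph as ID tuples grouped by subject (Triples), and keeping descriptive metadata apart (Header).

```python
from collections import defaultdict


def encode(triples):
    """Split string triples into a header, a term dictionary, and ID triples.

    Illustrative only: the layout is an assumption, not the HDT encoding.
    """
    # Dictionary: each distinct term is stored once and mapped to an integer ID.
    terms = sorted({term for triple in triples for term in triple})
    dictionary = {term: i + 1 for i, term in enumerate(terms)}

    # Triples: (predicate, object) ID pairs grouped under each subject ID.
    grouped = defaultdict(list)
    for s, p, o in triples:
        grouped[dictionary[s]].append((dictionary[p], dictionary[o]))
    id_triples = {s: sorted(set(pairs)) for s, pairs in sorted(grouped.items())}

    # Header: metadata about the data set (hypothetical fields).
    header = {
        "triples": sum(len(pairs) for pairs in id_triples.values()),
        "terms": len(terms),
    }
    return header, dictionary, id_triples


def decode(dictionary, id_triples):
    """Rebuild the original set of string triples from the two components."""
    inverse = {i: term for term, i in dictionary.items()}
    return {
        (inverse[s], inverse[p], inverse[o])
        for s, pairs in id_triples.items()
        for p, o in pairs
    }
```

In this sketch a term that occurs in many triples is stored as a string only once, which is one plausible source of the redundancy reduction the abstract mentions. Calling `decode(dictionary, id_triples)` on the output of `encode` returns the original triples as a set.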