Compression of RDF dictionaries

Authors:
Miguel A. Martínez-Prieto; Javier D. Fernández; Rodrigo Cánovas
Affiliations:
Univ. of Valladolid, Spain and Univ. of Chile, Chile; Univ. of Valladolid, Spain and Univ. of Chile, Chile; Univ. of Melbourne, Australia and Univ. of Chile, Chile
Venue:
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Year:
2012



Abstract

Applications that operate on huge RDF datasets commonly use dictionaries, which replace the long terms occurring in RDF triples with short IDs that reference them. This greatly compacts the dataset and thus mitigates its scalability issues. However, the dictionary itself is not negligible in size, and the techniques used to represent it also suffer from scalability limitations. This paper addresses this scenario by adapting compression techniques for string dictionaries to the case of RDF. We propose a novel technique, Dcomp, which can be tuned to represent the dictionary in compressed space (22–64%) and to operate within a few microseconds (1–50 μs).
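
To make the role of the dictionary concrete, the following is a minimal sketch of plain dictionary encoding, in which each distinct term receives an integer ID and triples are stored as ID tuples. It is not Dcomp and applies no compression to the dictionary itself. The class `TermDictionary`, the function `encode_triples`, and the example IRIs are hypothetical names chosen only for this illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


class TermDictionary:
    """Hypothetical two-way mapping between RDF terms and integer IDs.

    This is an uncompressed illustration of the general idea only;
    it is not the Dcomp technique proposed in the paper.
    """

    def __init__(self) -> None:
        self._term_to_id: Dict[str, int] = {}
        self._id_to_term: List[str] = []

    def encode(self, term: str) -> int:
        """Return the ID of a term, assigning the next free ID on first sight."""
        term_id = self._term_to_id.get(term)
        if term_id is None:
            term_id = len(self._id_to_term)
            self._term_to_id[term] = term_id
            self._id_to_term.append(term)
        return term_id

    def decode(self, term_id: int) -> str:
        """Return the term referenced by an ID."""
        return self._id_to_term[term_id]

    def __len__(self) -> int:
        return len(self._id_to_term)


def encode_triples(
    triples: List[Triple], dictionary: TermDictionary
) -> List[EncodedTriple]:
    """Replace every term of every triple by its dictionary ID."""
    return [
        (dictionary.encode(s), dictionary.encode(p), dictionary.encode(o))
        for s, p, o in triples
    ]


# Example data (assumed for illustration only).
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows", "http://example.org/alice"),
]
dictionary = TermDictionary()
encoded = encode_triples(triples, dictionary)
print(encoded)          # [(0, 1, 2), (2, 1, 0)]
print(len(dictionary))  # 3 distinct terms stored once each
```

In this sketch the six long strings in the triples collapse to six small integers, while each of the three distinct terms is stored once in the dictionary. The dictionary's own strings remain uncompressed here, which is the part the abstract identifies as a remaining scalability problem.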