Dictionaries are commonly used by applications that operate on huge RDF datasets. A dictionary allows the long terms occurring in RDF triples to be replaced by short IDs that reference them. This substitution greatly compacts the dataset and mitigates the scalability issues inherent in managing it. However, the size of the dictionary itself is not negligible, and the techniques used to represent it also suffer from scalability limitations. This paper addresses this scenario by adapting compression techniques for string dictionaries to the case of RDF. We propose a novel technique, Dcomp, which can be tuned to represent the dictionary in compressed space (22--64%) and to perform basic lookup operations in a few microseconds (1--50 μs). In addition, we propose Dcomp as a basis for specific SPARQL query optimizations that leverage its ability to resolve FILTER expressions early.
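The term-to-ID substitution described above can be illustrated with a minimal, uncompressed sketch. This is not Dcomp: the class `TermDictionary`, the functions `encode` and `ids_matching`, and the example URIs are all hypothetical names chosen for illustration, and a plain hash map stands in for the compressed representation the paper proposes. The last function hints at what early FILTER resolution might look like: the condition is evaluated once over the dictionary, and triples are then tested by ID only.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


class TermDictionary:
    """Hypothetical, uncompressed RDF term dictionary (illustration only)."""

    def __init__(self) -> None:
        self._term_to_id: Dict[str, int] = {}
        self._id_to_term: List[str] = []

    def locate(self, term: str) -> int:
        """Return the ID of a term, assigning the next free ID on first sight."""
        term_id = self._term_to_id.get(term)
        if term_id is None:
            term_id = len(self._id_to_term)
            self._term_to_id[term] = term_id
            self._id_to_term.append(term)
        return term_id

    def extract(self, term_id: int) -> str:
        """Return the term string that an ID refers to."""
        return self._id_to_term[term_id]

    def __len__(self) -> int:
        return len(self._id_to_term)


def encode(triples: List[Triple], d: TermDictionary) -> List[EncodedTriple]:
    """Replace every term in every triple by its short integer ID."""
    return [(d.locate(s), d.locate(p), d.locate(o)) for s, p, o in triples]


def ids_matching(d: TermDictionary, condition: Callable[[str], bool]) -> Set[int]:
    """Evaluate a filter condition once per dictionary entry, returning IDs."""
    return {i for i in range(len(d)) if condition(d.extract(i))}


# Usage: two triples sharing the term for "bob".
d = TermDictionary()
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name", '"Bob"'),
]
encoded = encode(triples, d)

# Keep only triples whose object is a literal, testing IDs rather than strings.
literal_ids = ids_matching(d, lambda term: term.startswith('"'))
with_literal_object = [t for t in encoded if t[2] in literal_ids]
```

In this sketch each distinct term is stored once, however often it occurs in the triples, which is where the compaction of the dataset comes from. The dictionary itself still holds every full string, which is the cost the abstract points to.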