Managing gigabytes (2nd ed.): compressing and indexing documents and images
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Introduction to algorithms
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Opportunistic data structures with applications
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Cache-oblivious string B-trees
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
ACM Computing Surveys (CSUR)
Scalable semantic web data management using vertical partitioning
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Hexastore: sextuple indexing for semantic web data management
Proceedings of the VLDB Endowment
The RDF-3X engine for scalable management of RDF data
The VLDB Journal — The International Journal on Very Large Data Bases
Organization and maintenance of large ordered indices
SIGFIDET '70 Proceedings of the 1970 ACM SIGFIDET (now SIGMOD) Workshop on Data Description, Access and Control
The compressed permuterm index
ACM Transactions on Algorithms (TALG)
Compact representation of large RDF data sets for publishing and exchange
ISWC'10 Proceedings of the 9th international semantic web conference on The semantic web - Volume Part I
Compressed string dictionaries
SEA'11 Proceedings of the 10th international conference on Experimental algorithms
Data Management and Query Processing in Semantic Web Databases
Data Management and Query Processing in Semantic Web Databases
Binary RDF for scalable publishing, exchanging and consumption in the web of data
Proceedings of the 21st international conference companion on World Wide Web
Exchange and consumption of huge RDF data
ESWC'12 Proceedings of the 9th international conference on The Semantic Web: research and applications
Hi-index | 0.00 |
The use of dictionaries is a common practice among those applications performing on huge RDF datasets. It allows long terms occurring in the RDF triples to be replaced by short IDs which reference them. This decision greatly compacts the dataset and thus mitigates its scalability issues. However, the dictionary size is not negligible and the techniques used for its representation also suffer from scalability limitations. This paper focuses on this scenario by adapting compression techniques for string dictionaries to the case of RDF. We propose a novel technique: Dcomp, which can be tuned to represent the dictionary in compressed space (22--64%) and to perform in a few microseconds (1--50μs).