Special Issue: MapReduce and its Applications
Concurrency and Computation: Practice & Experience
The Semantic Web contains many billions of statements, which are released using the Resource Description Framework (RDF) data model. To handle data at this scale, high-performance RDF applications must apply a compression technique. Unfortunately, because the input is so large, even the compression itself is challenging. In this paper, we propose a set of distributed MapReduce algorithms to efficiently compress and decompress large amounts of RDF data. Our approach uses a dictionary encoding technique that preserves the structure of the data. We highlight the problems of distributed data compression and describe the solutions that we propose. We have implemented a prototype using the Hadoop framework and evaluated its performance. We show that our approach compresses large amounts of data efficiently and scales linearly with both the input size and the number of nodes. Copyright © 2012 John Wiley & Sons, Ltd.
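
To make the idea of dictionary encoding concrete, the following is a minimal single-machine sketch in Python. It is not the distributed MapReduce algorithm proposed in the paper; it only illustrates how each RDF term can be replaced by a numeric ID while the triple structure is kept intact. The function names `encode_triples` and `decode_triples`, the sequential ID assignment, and the sample triples are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


def encode_triples(triples: List[Triple]) -> Tuple[Dict[str, int], List[EncodedTriple]]:
    """Replace every term with a numeric ID (hypothetical helper).

    IDs are assigned sequentially in order of first appearance; this is
    an illustrative choice, not the ID scheme used in the paper.
    """
    dictionary: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for subject, predicate, obj in triples:
        ids = []
        for term in (subject, predicate, obj):
            if term not in dictionary:
                dictionary[term] = len(dictionary)
            ids.append(dictionary[term])
        # The (subject, predicate, object) positions are preserved,
        # so the structure of the data survives the encoding.
        encoded.append((ids[0], ids[1], ids[2]))
    return dictionary, encoded


def decode_triples(dictionary: Dict[str, int], encoded: List[EncodedTriple]) -> List[Triple]:
    """Invert the dictionary and restore the original terms (hypothetical helper)."""
    reverse = {term_id: term for term, term_id in dictionary.items()}
    return [(reverse[s], reverse[p], reverse[o]) for s, p, o in encoded]


# Made-up sample data, for illustration only.
sample = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
dictionary, encoded = encode_triples(sample)
restored = decode_triples(dictionary, encoded)
```

Because repeated terms such as `rdf:type` are stored once in the dictionary and referenced by a small integer everywhere else, the encoded form is more compact. In a distributed setting the difficulty lies in assigning consistent IDs across many nodes, which this sketch does not address.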