Lossless Compression Based on the Sequence Memoizer

  • Authors:
  • Jan Gasthaus; Frank Wood; Yee Whye Teh

  • Venue:
  • DCC '10 Proceedings of the 2010 Data Compression Conference

  • Year:
  • 2010

Abstract

In this work we describe a sequence compression method that combines a Bayesian nonparametric sequence model with entropy encoding. The model, a hierarchy of Pitman-Yor processes of unbounded depth previously proposed by Wood et al. [16] in the context of language modelling, captures long-range dependencies by conditioning on contexts of unbounded length. We show that incremental approximate inference can be performed in this model, making it usable in a text compression setting. The resulting compressor reliably outperforms several PPM variants on many types of data, and is particularly effective on data that exhibits power-law properties.
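
The pipeline the abstract describes, online predictions from an unbounded-depth suffix model fed to an entropy coder, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single fixed discount (the real model ties discounts to context depth), a deterministic minimal-seating approximation in the spirit of interpolated Kneser-Ney rather than the paper's approximate inference scheme, a naive dictionary of context tuples in place of the Sequence Memoizer's compact suffix tree, and the ideal code length -log2 p(symbol | context) in place of an actual arithmetic coder. All names and the DISCOUNT value are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = 256   # byte alphabet, matching a byte-oriented text compressor
DISCOUNT = 0.5   # hypothetical fixed discount; the real model varies it with depth


class SuffixBackoffModel:
    """Unbounded-depth context model with Pitman-Yor-style discounting,
    approximated deterministically (minimal seating, as in interpolated
    Kneser-Ney). Contexts are stored naively in a dict keyed by tuples,
    which is quadratic in space; the Sequence Memoizer itself relies on
    a compact suffix tree to remain linear."""

    def __init__(self):
        # counts[context][symbol] = number of (approximate) customers
        self.counts = defaultdict(lambda: defaultdict(int))

    def prob(self, context, symbol):
        """Predictive probability of `symbol` after `context` (a tuple of
        ints), interpolating from the empty context out to the full one."""
        p = 1.0 / ALPHABET  # base distribution: uniform over bytes
        for start in range(len(context), -1, -1):
            table = self.counts.get(context[start:])
            if not table:
                break  # this suffix (hence every longer one) is unseen
            total = sum(table.values())
            c = table.get(symbol, 0)
            # Discounted count plus mass backed off to the shorter context.
            p = max(c - DISCOUNT, 0.0) / total + (DISCOUNT * len(table) / total) * p
        return p

    def update(self, context, symbol):
        """Incremental single-pass update: add the observation at the deepest
        context and propagate to shorter contexts only while the symbol is
        new there (the 'new table sends a customer to the parent' rule of
        the Chinese-restaurant representation)."""
        while context is not None:
            seen_before = self.counts[context][symbol] > 0
            self.counts[context][symbol] += 1
            if seen_before:
                break
            context = context[1:] if context else None


def ideal_code_length(data):
    """Bits an ideal entropy coder (e.g. arithmetic coding) would spend
    encoding `data` under the model's one-pass predictions."""
    model = SuffixBackoffModel()
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = tuple(data[:i])              # unbounded conditioning context
        bits += -math.log2(model.prob(ctx, sym))
        model.update(ctx, sym)             # learn while compressing
    return bits


if __name__ == "__main__":
    text = b"abracadabra abracadabra abracadabra"
    bits = ideal_code_length(text)
    print(f"{bits:.1f} bits vs {8 * len(text)} bits raw")
```

A real compressor would pass each predictive distribution to an arithmetic coder, whose output length comes within a couple of bits of the ideal total computed above; the decoder recovers the sequence by maintaining the identical model state and updates on its side.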