Data compression and learning are, in some sense, two sides of the same coin. If we paraphrase Occam's razor as saying that a small theory is better than a larger theory with the same explanatory power, we can characterize data compression as a preoccupation with small, and learning as a preoccupation with better. Nevill-Manning et al. (see Proc. Data Compression Conference, Los Alamitos, CA, pp. 244-253, 1994) presented an algorithm, since dubbed SEQUITUR, that exhibits both faces of the compression/learning coin. Its performance as a data compression scheme outstrips other dictionary schemes, and the structures that it learns from sequences as diverse as DNA and music are intuitively compelling. We present three new results that characterize SEQUITUR's computational and compression performance. First, we prove that SEQUITUR operates in time linear in n, the length of the input sequence, despite its ability to build a hierarchy as deep as log(n). Second, we show that a sequence can be compressed incrementally, improving on the non-incremental algorithm described by Nevill-Manning et al. and making on-line compression feasible. Third, we present an intriguing result that emerged during benchmarking: whereas PPMC outperforms SEQUITUR on most files in the Calgary corpus, SEQUITUR regains the lead when tested on multimegabyte sequences. We draw some tentative conclusions about the underlying reasons for this phenomenon, and about the nature of current compression benchmarking.
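To make the idea of a hierarchy learned from a sequence concrete, the sketch below infers a small grammar from a string by repeatedly replacing the most frequent repeated pair of adjacent symbols with a new rule. This is not SEQUITUR itself: it is a simpler offline, quadratic-time digram-replacement scheme (in the style of Re-Pair), without the linear-time or incremental behaviour the abstract describes. The function names (`infer_grammar`, `expand`, and the helpers) and the choice of integers for rule identifiers are assumptions made purely for illustration, and the input is assumed to be a string of single-character terminals.

```python
from collections import Counter


def count_digrams(body):
    """Count non-overlapping occurrences of each adjacent symbol pair."""
    counts = Counter()
    last_counted = {}  # pair -> index of its most recently counted occurrence
    for i in range(len(body) - 1):
        pair = (body[i], body[i + 1])
        # In a run such as "aaa" the pair (a, a) overlaps itself; skip the overlap.
        if last_counted.get(pair) == i - 1:
            continue
        counts[pair] += 1
        last_counted[pair] = i
    return counts


def replace_pair(body, pair, symbol):
    """Replace each left-to-right, non-overlapping occurrence of pair by symbol."""
    out, i = [], 0
    while i < len(body):
        if i + 1 < len(body) and (body[i], body[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(body[i])
            i += 1
    return out


def infer_grammar(text):
    """Illustrative offline digram replacement; NOT the SEQUITUR algorithm.

    Terminals are one-character strings, rule identifiers are integers.
    Returns (top_level_body, rules) where rules maps id -> (left, right).
    """
    body = list(text)
    rules = {}
    next_id = 0
    while True:
        counts = count_digrams(body)
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:  # no pair repeats, so a new rule would not save anything
            break
        rules[next_id] = pair
        body = replace_pair(body, pair, next_id)
        next_id += 1
    return body, rules


def expand(symbols, rules):
    """Expand a sequence of symbols back into the original terminal string."""
    out = []
    for s in symbols:
        if isinstance(s, int):
            out.append(expand(rules[s], rules))
        else:
            out.append(s)
    return "".join(out)


if __name__ == "__main__":
    text = "abcabcabcabc"
    body, rules = infer_grammar(text)
    print("top level:", body)
    for rule_id, pair in rules.items():
        print(rule_id, "->", pair)
    assert expand(body, rules) == text
```

Because each rule can refer to earlier rules, the result is a hierarchy rather than a flat dictionary, which is the property the abstract refers to when it speaks of learned structure. Expanding the top-level body with `expand` recovers the input exactly.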