A fast algorithm for computing longest common subsequences
Communications of the ACM
A low-bandwidth network file system
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Compactly encoding unstructured inputs with differential compression
Journal of the ACM (JACM)
Engineering a Differencing and Compression Data Format
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference
ActiveSync, TCP/IP and 802.11b Wireless Vulnerabilities of WinCE-Based PDAs
WETICE '02 Proceedings of the 11th IEEE International Workshops on Enabling Technologies: nfrastructure for Collaborative Enterprises
Improving duplicate elimination in storage systems
ACM Transactions on Storage (TOS)
Redundancy elimination within large collections of files
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Alternatives for detecting redundancy in storage systems data
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
TAPER: tiered approach for eliminating redundancy in replica synchronization
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Version Control with Subversion
Version Control with Subversion
Venti: a new approach to archival storage
FAST'02 Proceedings of the 1st USENIX conference on File and storage technologies
Hi-index | 0.00 |
This paper presents a two-phase synchronization algorithm—tpsync, which combines content-defined chunking (CDC) with sliding block duplicated data detection methods tpsync firstly partitions synchronized files into variable-sized chunks in coarse-grained scale with CDC method, locates the unmatched chunks of synchronized files using the edit distance algorithm, and finally generates the fine-grained delta data with fixed-sized sliding block duplicated data detection method At the first-phase, tpsync can quickly locate the partial changed chunks between two files through similar files' fingerprint characteristics On the basis of the first phase's results, small fixed-sized sliding block duplicated data detection method can produce better fine-grained delta data between the corresponding unmatched data chunks further Extensive experiments on ASCII, binary and database files demonstrate that tpsync can achieve a higher performance on synchronization time and total transferred data compared to traditional fixed-sized sliding block method—rsync Compared to rsync, tpsync reduces synchronization time by 12% and bandwidth by 18.9% on average if optimized parameters are applied on both With signature cached synchronization method adopted, tpsync can yield a better performance.