Many contemporary approaches for speeding up large file transfers download chunks of a data object from multiple sources. Systems such as BitTorrent quickly locate sources that hold an exact copy of the desired object, but they cannot use sources that serve similar but non-identical objects. Other systems automatically exploit cross-file similarity by identifying sources for each chunk of the object. These systems, however, require a number of lookups proportional to the number of chunks in the object, and a mapping from each unique chunk in every identical and similar object to its corresponding sources. The number of lookups and mappings in such a system can therefore be quite large, limiting its scalability. This paper presents a hybrid system that provides the best of both approaches: it locates identical and similar sources for data objects using a constant number of lookups and inserts a constant number of mappings per object. We first demonstrate through extensive data analysis that similarity does exist among objects of popular file types, and that making use of it can sometimes substantially improve download times. Next, we describe handprinting, a technique that allows clients to locate similar sources using a constant number of lookups and mappings. Finally, we describe the design, implementation, and evaluation of Similarity-Enhanced Transfer (SET), a system that uses this technique to download objects. Our experimental evaluation shows that by using sources of similar objects, SET is able to significantly outperform an equivalently configured BitTorrent.
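
The handprinting idea described above can be illustrated with a minimal sketch. The abstract does not spell out how a handprint is built, so the code assumes one plausible construction: hash every chunk, then keep the k smallest hashes as the object's handprint. Fixed-size chunking, the constants `CHUNK_SIZE` and `HANDPRINT_SIZE`, the in-memory `Directory` class, and all function names are hypothetical and are not taken from SET itself. The point is only that publishing and searching each touch k directory entries, regardless of how many chunks the object has.

```python
import hashlib

CHUNK_SIZE = 16 * 1024   # hypothetical fixed chunk size, in bytes
HANDPRINT_SIZE = 4       # hypothetical constant k


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and return one SHA-1 hex digest per chunk."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def handprint(hashes, k=HANDPRINT_SIZE):
    """Assumed construction: the k smallest distinct chunk hashes."""
    return sorted(set(hashes))[:k]


class Directory:
    """In-memory stand-in for a lookup service (for example, a DHT)."""

    def __init__(self):
        self.table = {}
        self.lookups = 0   # counts lookups so the constant cost is visible

    def insert(self, key, source):
        self.table.setdefault(key, set()).add(source)

    def lookup(self, key):
        self.lookups += 1
        return set(self.table.get(key, set()))


def publish(directory, source, data, chunk_size=CHUNK_SIZE, k=HANDPRINT_SIZE):
    """Insert at most k mappings for an object held by `source`."""
    for h in handprint(chunk_hashes(data, chunk_size), k):
        directory.insert(h, source)


def find_similar_sources(directory, wanted_hashes, k=HANDPRINT_SIZE):
    """Perform at most k lookups and return every source that shares a handprint hash."""
    sources = set()
    for h in handprint(wanted_hashes, k):
        sources |= directory.lookup(h)
    return sources
```

Under this sampling rule, two objects that share most of their chunks are likely to share at least one handprint entry, because both handprints are drawn from the low end of largely overlapping hash sets. Objects with no chunks in common have no shared entries and are never returned.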