Among the vast numbers of images on the web are many duplicates and near-duplicates, that is, variants derived from the same original image. Such near-duplicates appear in many web image searches and may represent infringements of copyright or indicate the presence of redundancy. While methods for identifying near-duplicates have been investigated, there has been no analysis of the kinds of alterations that are common on the web or evaluation of whether real cases of near-duplication can in fact be identified. In this paper we use popular queries and a commercial image search service to collect images that we then manually analyse for instances of near-duplication. We show that such duplication is indeed significant, but that not all kinds of image alteration explored in previous literature are evident in web data. Removal of near-duplicates from a collection is impractical, but we propose that they be removed from sets of answers. We evaluate our technique for automatic identification of near-duplicates during query evaluation and show that it has promise as an effective mechanism for management of near-duplication in practice.