A contextual analysis of the YouTube duplicate content

Authors:
Tiago Rodrigues;Fabrício Benevenuto;Virgílio Almeida;Jussara Almeida;Marcos Gonçalves
Affiliations:
UFMG, Belo Horizonte/Brasil;UFMG, Belo Horizonte/Brasil;UFMG, Belo Horizonte/Brasil;UFMG, Belo Horizonte/Brasil;UFMG, Belo Horizonte/Brasil
Venue:
WebMedia '09 Proceedings of the XV Brazilian Symposium on Multimedia and the Web
Year:
2009

Abstract

Videos have become a predominant part of users' daily lives on the Web, especially with the emergence of online video social networks such as YouTube. Since users can independently share videos in these systems, some videos can be duplicates (i.e., identical or very similar videos). Despite having the same content, duplicates can differ, for example, in their associated metadata (e.g., tags, title) and their popularity scores (e.g., number of views, comments). Quantifying these differences is important for three reasons. First, it helps us understand how users associate metadata with videos on YouTube, which is crucial for video information retrieval mechanisms and recommendation systems. Second, it helps us understand the factors that influence the popularity of videos, which is essential for associating advertisements with videos and for performance issues related to the use of caches and CDNs. Third, it is needed to detect opportunistic actions, which pollute the system and compromise its use. This work presents a broad characterization of the differences among identical content in online video sharing systems. Using a large video sample collected from YouTube, we construct a data set of duplicates. Besides quantifying contextual differences among duplicates, our results also reveal the presence of suspicious behavior in the creation of metadata and its association with videos.
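As a rough illustration of what quantifying such differences could look like, the sketch below compares the tag sets and view counts of two duplicates. The abstract does not describe the measures the authors actually used, so the Jaccard coefficient, the view ratio, the record layout (`tags`, `views`), and the function names are all assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard coefficient between two collections of tags (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty tag sets are treated as identical
    return len(a & b) / len(a | b)


def compare_duplicates(original, duplicate):
    """Compare two hypothetical video records given as dicts with 'tags' and 'views'."""
    views = original["views"]
    return {
        # Overlap in user-assigned tags, from 0.0 (disjoint) to 1.0 (identical)
        "tag_jaccard": jaccard(original["tags"], duplicate["tags"]),
        # Popularity of the duplicate relative to the original; None if undefined
        "view_ratio": duplicate["views"] / views if views else None,
    }


# Made-up records used only to exercise the functions
video_a = {"tags": ["music", "live", "concert"], "views": 1000}
video_b = {"tags": ["music", "concert", "hd", "free"], "views": 250}
result = compare_duplicates(video_a, video_b)
```

A low tag overlap between videos with the same content would indicate that users describe identical content differently. A duplicate carrying many unrelated tags could be one signal of the opportunistic behavior the abstract mentions, although that interpretation is also an assumption here.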