Automatic detection of local reuse

  • Authors:
  • Arno Mittelbach;Lasse Lehmann;Christoph Rensing;Ralf Steinmetz

  • Affiliations:
  • KOM - Multimedia Communications Lab, Technische Universität Darmstadt, Darmstadt;KOM - Multimedia Communications Lab, Technische Universität Darmstadt, Darmstadt;KOM - Multimedia Communications Lab, Technische Universität Darmstadt, Darmstadt;KOM - Multimedia Communications Lab, Technische Universität Darmstadt, Darmstadt

  • Venue:
  • EC-TEL'10: Proceedings of the 5th European Conference on Technology Enhanced Learning, Sustaining TEL: From Innovation to Learning and Practice
  • Year:
  • 2010

Abstract

Local reuse detection is a prerequisite for a multitude of tasks, ranging from document management and information retrieval to web search and plagiarism detection. Its results can support authors in creating new learning resources, and learners in finding existing ones, by providing accurate suggestions for related documents. While the detection of local text reuse, i.e. reuse of parts of documents, is covered by various approaches, reuse detection for object-based documents has hardly been considered so far. In this paper we propose a new fingerprinting technique for local reuse detection in both text-based and object-based documents, which is based on the contiguity of documents. This additional information, which is generally disregarded by existing approaches, allows the creation of shorter and more flexible fingerprints. Evaluations performed on different corpora show that the technique performs better than existing approaches while consuming significantly less storage.
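
To make the notion of fingerprint-based local reuse detection concrete, the following is a minimal sketch of a generic baseline: hashed word k-grams with "0 mod p" selection, compared by containment. It is not the contiguity-based technique proposed in the paper, whose details the abstract does not give. The function names (`shingles`, `fingerprint`, `containment`) and the parameters `k` and `p` are assumptions chosen for illustration.

```python
import hashlib


def shingles(text, k=3):
    """Return all contiguous word k-grams of the text, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def fingerprint(text, k=3, p=1):
    """Hash every word k-gram and keep the hashes divisible by p.

    This is the classic "0 mod p" selection: p=1 keeps every hash,
    larger p keeps roughly one hash in p and shrinks the fingerprint.
    (Illustrative baseline only, not the paper's method.)
    """
    selected = set()
    for gram in shingles(text, k):
        digest = hashlib.sha1(gram.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")  # first 64 bits of the hash
        if value % p == 0:
            selected.add(value)
    return selected


def containment(fp_a, fp_b):
    """Fraction of the hashes in fp_a that also occur in fp_b."""
    if not fp_a:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a)


if __name__ == "__main__":
    source = "local reuse detection is a prerequisite for many tasks"
    target = "as noted before local reuse detection is a prerequisite for search"
    # A high containment suggests that part of `source` reappears in `target`.
    print(containment(fingerprint(source), fingerprint(target)))
```

Containment rather than symmetric similarity is used here because local reuse concerns a part of one document reappearing in another, so the two documents may differ greatly in length. Raising `p` trades detection accuracy for storage, which is the trade-off the abstract refers to when it mentions shorter fingerprints.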