Efficient Common Items Extraction from Multiple Sorted Lists

  • Authors:
  • Wei Lu;Chuitian Rong;Jinchuan Chen;Xiaoyong Du;Gabriel Pui Cheong Fung;Xiaofang Zhou

  • Venue:
  • APWEB '10 Proceedings of the 2010 12th International Asia-Pacific Web Conference
  • Year:
  • 2010

Abstract

Given a set of lists, each sorted in ascending order of item value, the objective of this paper is to efficiently find the common items that appear in all of the lists. This problem is sometimes known as common items extraction from sorted lists. A common approach is to scan the items of all lists sequentially, in step with one another, until one of the lists is exhausted. However, we observe that when the overlap of items across the lists is not high, this sequential access approach can be improved significantly. In this paper, we propose two algorithms, MergeSkip and MergeESkip, which solve the problem by skipping as many list items as possible. As a result, a large number of comparisons among items are saved and efficiency improves. We conduct an extensive analysis of the proposed algorithms on one real dataset and two synthetic datasets with different data distributions, and we report all of our findings in this paper.
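
To make the contrast in the abstract concrete, here is a minimal Python sketch of the two ideas: a sequential scan that advances one item at a time, and a generic skipping variant that jumps ahead with binary search. This is not the paper's MergeSkip or MergeESkip, whose details the abstract does not give. The function names `intersect_scan` and `intersect_skip` are hypothetical, and both assume each list is strictly ascending (no duplicates within a list).

```python
from bisect import bisect_left


def intersect_scan(lists):
    """Baseline: move cursors one item at a time until a list is exhausted."""
    if not lists:
        return []
    pos = [0] * len(lists)
    out = []
    while all(p < len(l) for p, l in zip(pos, lists)):
        heads = [l[p] for p, l in zip(pos, lists)]
        target = max(heads)
        if all(h == target for h in heads):
            out.append(target)               # item is present in every list
            pos = [p + 1 for p in pos]
        else:
            # advance only the cursors that lag behind the largest head
            pos = [p + 1 if h < target else p for p, h in zip(pos, heads)]
    return out


def intersect_skip(lists):
    """Illustrative skipping variant (assumption: binary search via bisect).

    Instead of stepping item by item, each cursor jumps directly to the
    first item >= the current candidate, skipping everything in between.
    """
    if not lists or any(not l for l in lists):
        return []
    pos = [0] * len(lists)
    out = []
    target = max(l[0] for l in lists)
    while True:
        matched = True
        for i, l in enumerate(lists):
            p = bisect_left(l, target, pos[i])  # skip items smaller than target
            if p == len(l):
                return out                      # one list is exhausted
            pos[i] = p
            if l[p] != target:
                target = l[p]                   # larger candidate; start over
                matched = False
                break
        if matched:
            out.append(target)
            for i, l in enumerate(lists):
                pos[i] += 1
                if pos[i] == len(l):
                    return out
            target = max(l[pos[i]] for i, l in enumerate(lists))


if __name__ == "__main__":
    a = [1, 3, 5, 7, 9, 11]
    b = [2, 3, 7, 8, 11, 20]
    c = [3, 4, 7, 11, 30]
    print(intersect_scan([a, b, c]))
    print(intersect_skip([a, b, c]))
```

Both functions return the same result. The skipping variant does less work when the lists share few items, because long runs of non-matching items are passed over in a single jump rather than compared one by one.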