Two-Dimensional Distributed Inverted Files

  • Authors: Esteban Feuerstein; Mauricio Marin; Michel Mizrahi; Veronica Gil-Costa; Ricardo Baeza-Yates
  • Affiliations (in author order): Departamento de Computación, FCEyN, Universidad de Buenos Aires, Argentina; Yahoo! Research Latin America, Santiago, Chile; Departamento de Computación, FCEyN, Universidad de Buenos Aires, Argentina; Yahoo! Research Latin America, Santiago, Chile; Yahoo! Research Latin America, Santiago, Chile
  • Venue: SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
  • Year: 2009

Abstract

Term-partitioned indexes are generally inefficient for evaluating conjunctive queries, because they require long posting lists to be communicated between processors. Document-partitioned indexes, on the other hand, incur excessive overhead because every processor participates in the evaluation of every query, so they do not scale adequately for real systems. We propose arranging a set of processors in a two-dimensional array, applying term partitioning at the row level and document partitioning at the column level. This paper addresses how to choose the number of rows and columns for a given number of processors, and how best to partition the index over that topology.
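
The two-dimensional arrangement can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each term is assigned to one row and each document to one column, so that processor (r, c) stores the postings of row r's terms restricted to column c's documents. The hash-based assignments and all function names (term_row, doc_col, build_grid_index, processors_for_query, conjunctive_query) are hypothetical choices made for illustration only.

```python
import zlib
from collections import defaultdict


def term_row(term, rows):
    """Assumed term partitioning: hash each term to one row (deterministic CRC32)."""
    return zlib.crc32(term.encode("utf-8")) % rows


def doc_col(doc_id, cols):
    """Assumed document partitioning: assign each document to one column."""
    return doc_id % cols


def build_grid_index(docs, rows, cols):
    """Build a rows x cols grid of small inverted files.

    grid[r][c] maps a term to the sorted list of ids of documents that
    contain it, limited to terms of row r and documents of column c.
    """
    grid = [[defaultdict(list) for _ in range(cols)] for _ in range(rows)]
    for doc_id in sorted(docs):
        for term in set(docs[doc_id]):
            cell = grid[term_row(term, rows)][doc_col(doc_id, cols)]
            cell[term].append(doc_id)
    return grid


def processors_for_query(terms, rows, cols):
    """Return the (row, column) cells that must take part in a query.

    Only the rows owning the query terms are involved, but every column
    of those rows is, since each column holds a different document subset.
    """
    touched_rows = {term_row(t, rows) for t in terms}
    return {(r, c) for r in touched_rows for c in range(cols)}


def conjunctive_query(grid, terms, rows, cols):
    """Evaluate an AND query column by column and merge the results."""
    if not terms:
        return []
    result = []
    for c in range(cols):
        # Within one column, intersect the posting lists of all query terms;
        # lists in different rows would have to be communicated between rows.
        lists = [set(grid[term_row(t, rows)][c].get(t, [])) for t in terms]
        result.extend(set.intersection(*lists))
    return sorted(result)


docs = {0: ["a", "b"], 1: ["a"], 2: ["a", "b", "c"], 3: ["b"]}
grid = build_grid_index(docs, rows=2, cols=2)
matches = conjunctive_query(grid, ["a", "b"], rows=2, cols=2)
```

Under these assumptions, a grid with a single row behaves like a purely document-partitioned index (every processor participates in every query), while a grid with a single column behaves like a purely term-partitioned one (only the processors owning the query terms participate, but whole posting lists must be exchanged). Intermediate shapes trade one cost against the other.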