Efficient indexing of web pages using PR+ trees

  • Authors:
  • Bhaskar Biswas;Karan Jain;K. K. Shukla

  • Affiliations:
  • Banaras Hindu University, Varanasi;Banaras Hindu University, Varanasi;Banaras Hindu University, Varanasi

  • Venue:
  • Proceedings of the International Conference on Advances in Computing, Communication and Control
  • Year:
  • 2009

Abstract

R-trees have been proposed to handle spatial datasets, as required in computer-aided design and geo-data applications. Many variants of R-trees have also been proposed to improve the efficiency of the insertion, deletion and search operations. A spatial dataset contains objects that may or may not overlap, and a web page can be visualized as a set of touching rectangles. Exploiting these two facts, we propose yet another variant, built on existing concepts and variants of R-trees, which leads to efficient hashing of web pages. Moreover, by applying such an algorithm to a cluster of structurally similar pages, one can save much of the time spent in parsing, indexing and generating a hash for the cluster, thereby reducing the overall complexity of generating a Universal Web Wrapper.
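
The abstract does not spell out the PR+ tree algorithm itself, so the following is only a minimal sketch of the underlying geometric idea: page blocks modelled as axis-aligned rectangles that touch without overlapping, plus the minimum bounding rectangle that an R-tree-style node would store for them. The names Rect, touches and mbr, and the example coordinates, are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical model of one page block)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def overlaps(self, other: "Rect") -> bool:
        # True only when the interiors intersect; a shared edge does not count.
        return (self.x1 < other.x2 and other.x1 < self.x2
                and self.y1 < other.y2 and other.y1 < self.y2)

    def touches(self, other: "Rect") -> bool:
        # The closed rectangles meet (edge or corner) but interiors do not overlap.
        closed_meet = (self.x1 <= other.x2 and other.x1 <= self.x2
                       and self.y1 <= other.y2 and other.y1 <= self.y2)
        return closed_meet and not self.overlaps(other)


def mbr(rects: Iterable[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty collection of rectangles."""
    items = list(rects)
    return Rect(min(r.x1 for r in items), min(r.y1 for r in items),
                max(r.x2 for r in items), max(r.y2 for r in items))


# Assumed layout: a header sitting directly on top of a body block.
header = Rect(0, 0, 100, 20)
body = Rect(0, 20, 100, 80)
page = mbr([header, body])
```

In this assumed layout the header and body share an edge, so they touch but do not overlap, and their bounding rectangle covers the whole page. How the actual PR+ tree splits, stores or hashes such rectangles is not stated in the abstract and is not modelled here.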