Self-similarity in the web

  • Authors:
  • Stephen Dill;Ravi Kumar;Kevin S. McCurley;Sridhar Rajagopalan;D. Sivakumar;Andrew Tomkins

  • Affiliations:
  • IBM Almaden Research Center, San Jose, CA (all authors)

  • Venue:
  • ACM Transactions on Internet Technology (TOIT)
  • Year:
  • 2002

Abstract

Algorithmic tools for searching and mining the Web are becoming increasingly sophisticated and vital. In this context, algorithms that use and exploit structural information about the Web perform better than generic methods in both efficiency and reliability. We present an extensive characterization of the graph structure of the Web, with a view to enabling high-performance applications that make use of this structure. In particular, we show that the Web emerges as the outcome of a number of essentially independent stochastic processes that evolve at various scales. A striking consequence of this scale invariance is that the structure of the Web is "fractal": cohesive subregions display the same characteristics as the Web at large. An understanding of this underlying fractal nature is therefore applicable to designing data services across multiple domains and scales. We describe potential applications of this line of research to optimized algorithm design for Web-scale data analysis.