Characterization of the Thai hostgraph

  • Authors:
  • Kulwadee Somboonviwat;Masashi Toyoda;Shinji Suzuki;Masaru Kitsuregawa

  • Affiliations:
  • The University of Tokyo;The University of Tokyo;The University of Tokyo;The University of Tokyo

  • Venue:
  • Proceedings of the 2nd international conference on Ubiquitous information management and communication
  • Year:
  • 2008

Abstract

The Web of a country, or the national Web, is the set of web pages related to a specific country. Understanding the graph structure of a national Web provides valuable insights for developing algorithms and localized search services that target that country. Many empirical studies of the graph structure of national Webs have been done at the level of individual web pages. In reality, however, Web information is organized into a hierarchically nested structure, the domain name system. This domain-name-based hierarchy adds intermediate levels of entities and administrative control to the Web. To better understand the characteristics and ecology of a national Web, it is therefore also necessary to understand its graph structure at a more abstract level. In this paper we focus on the graph structure of the Thai Web at the level of interconnections between hosts. The hostgraph is a directed graph in which each node corresponds to a host and each directed edge is weighted by the number of links between a pair of hosts. We report various graph properties of the Thai hostgraph based on a snapshot of the Thai Web obtained in January 2007. For each empirical result, we interpret its implications and discuss how to put it to practical use. We also give an example application of the hostgraph, namely mining web communities from the Thai hostgraph.
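
The hostgraph definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sample URLs under `example.th`, and the choice to drop links that stay within a single host are all assumptions made for illustration. The sketch collapses page-level links into host-level edges whose weight is the number of page links from one host to another.

```python
from collections import Counter
from urllib.parse import urlsplit


def build_hostgraph(page_links):
    """Collapse (source URL, target URL) pairs into weighted host-level edges.

    Returns a Counter mapping (source host, target host) to the number of
    page-level links between them.
    """
    edges = Counter()
    for src_url, dst_url in page_links:
        src_host = urlsplit(src_url).hostname
        dst_host = urlsplit(dst_url).hostname
        # Assumption: skip malformed URLs and links that stay inside one host.
        if src_host is None or dst_host is None or src_host == dst_host:
            continue
        edges[(src_host, dst_host)] += 1
    return edges


def in_degrees(edges):
    """Count, for each host, how many distinct hosts link to it."""
    degrees = Counter()
    for _, dst_host in edges:
        degrees[dst_host] += 1
    return degrees


# Hypothetical page-level links, for illustration only.
SAMPLE_LINKS = [
    ("http://a.example.th/1", "http://b.example.th/"),
    ("http://a.example.th/2", "http://b.example.th/x"),
    ("http://a.example.th/1", "http://a.example.th/2"),  # intra-host, dropped
    ("http://c.example.th/", "http://b.example.th/"),
]

hostgraph = build_hostgraph(SAMPLE_LINKS)
host_in_degrees = in_degrees(hostgraph)
```

With this representation, properties such as in-degree and edge-weight distributions can be computed directly from the edge counter. Whether intra-host links should be kept as self-loops depends on the analysis and is left open here.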