A Class-Based Search System in Unstructured P2P Networks

  • Authors:
  • Juncheng Huang;Xiuqi Li;Jie Wu

  • Affiliations:
  • Florida Atlantic University;Florida Atlantic University;Florida Atlantic University

  • Venue:
  • AINA '07 Proceedings of the 21st International Conference on Advanced Information Networking and Applications
  • Year:
  • 2007

Abstract

Efficient searching is one of the important design issues in peer-to-peer (P2P) networks. Among the various searching techniques, semantic-based searching has recently drawn significant attention. The Gnutella-like efficient searching system (GES) [18] is one such system. GES derives a node vector, a semantic summary of all the documents on a node, based on the vector space model (VSM). Its topology adaptation algorithm and search protocol are then designed around the similarity between the node vectors of different nodes. However, although GES is suitable when the distribution of documents on each node is uniform, it may not be efficient when the distribution is diverse: when a node holds documents from many categories, a single node vector may represent them inaccurately. We extend the idea of GES and present a class-based semantic searching system (CSS). CSS uses a data clustering algorithm, online spherical k-means clustering (OSKM) [16], to cluster all documents on a node into several classes. Each class can be viewed as a virtual node, and virtual nodes are connected through virtual links. As a result, the class vector replaces the node vector and plays an important role in class-based topology adaptation and search, which makes CSS very efficient. Our simulation using the IR benchmark TREC collection demonstrates that CSS outperforms GES, achieving higher recall, higher precision, and lower search cost.
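The difference between a node vector and class vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the raw term-frequency weighting, the fixed learning rate, the seeding with the first k documents, and the toy documents are all assumptions made for illustration, and the clustering loop is a simplified online spherical k-means rather than the exact OSKM procedure of [16].

```python
import math
from collections import Counter


def unit(vec):
    """Scale a sparse term -> weight mapping to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def doc_vector(text):
    """Unit-length term-frequency vector (assumed weighting, no IDF)."""
    return unit(Counter(text.lower().split()))


def node_vector(docs):
    """One summary vector for every document on a node."""
    total = Counter()
    for d in docs:
        total.update(d)  # adds the weights term by term
    return unit(total)


def online_spherical_kmeans(docs, k, rate=0.5, epochs=5):
    """Simplified online spherical k-means (hypothetical parameters).

    Each document pulls its most similar centroid toward itself, and the
    centroid is re-projected onto the unit sphere after every update.
    """
    centroids = [dict(d) for d in docs[:k]]  # assumption: seed with first k docs
    for _ in range(epochs):
        for d in docs:
            best = max(range(k), key=lambda i: cosine(centroids[i], d))
            moved = Counter(centroids[best])
            for t, w in d.items():
                moved[t] += rate * w
            centroids[best] = unit(moved)
    labels = [max(range(k), key=lambda i: cosine(centroids[i], d)) for d in docs]
    return centroids, labels


# Toy node holding documents from two unrelated categories.
texts = [
    "gene protein cell dna",
    "goal match league striker",
    "protein dna genome cell",
    "league goal referee match",
]
docs = [doc_vector(t) for t in texts]

node_vec = node_vector(docs)
class_vecs, labels = online_spherical_kmeans(docs, k=2)

query = doc_vector("dna cell")
node_sim = cosine(query, node_vec)
best_class_sim = max(cosine(query, c) for c in class_vecs)
print(labels, round(node_sim, 3), round(best_class_sim, 3))
```

In this toy setting the query matches the class vector of the relevant category more closely than it matches the single node vector, because the node vector is diluted by terms from the unrelated category. This is the intuition behind treating each class as a virtual node.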