Searching the peer-to-peer networks: the community and their queries

  • Authors:
  • Sai Ho Kwok; Christopher C. Yang

  • Affiliations:
  • Department of Information and Systems Management, The Hong Kong University of Science and Technology; Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong

  • Venue:
  • Journal of the American Society for Information Science and Technology - Special issue: Part II: Information seeking research
  • Year:
  • 2004

Abstract

Peer-to-Peer (P2P) networks provide a new distributed computing paradigm on the Internet for file sharing. The decentralized nature of P2P networks fosters both cooperative and non-cooperative behaviors in sharing resources. Searching is a major component of P2P file sharing. Several studies have reported on the nature of queries submitted to World Wide Web (WWW) search engines, but no studies of queries in P2P networks have yet been reported. In this report, we present our study of the Gnutella network, a decentralized and unstructured P2P network. We found that the majority of Gnutella users are located in the United States. Most queries are repeated. This may be because the hosts of the target files connect to or disconnect from the network at any time, so clients resubmit their queries. Queries are also forwarded from peer to peer. The findings are compared with data from two other studies of Web queries. Queries in the Gnutella network are longer than those reported in the studies of WWW search engines. The most frequent queries are mostly related to the names of movies, songs, artists, singers, and directors. The most frequent terms are related to file formats, entertainment, and sexuality. This study is important for the future design of applications, architecture, and services of P2P networks.
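
The measurements named in the abstract (repeated queries, query length, and the most frequent queries and terms) can be illustrated with a minimal sketch. It is not the authors' method: the function name `summarize_queries`, the normalization rule (lowercasing and collapsing whitespace), and the sample queries are all assumptions made for illustration, and the abstract does not say how the study tokenized or matched queries.

```python
from collections import Counter


def summarize_queries(queries):
    """Compute simple descriptive statistics for a list of query strings.

    Hypothetical helper for illustration; not taken from the study.
    """
    # Assumed normalization: lowercase and collapse whitespace so that
    # trivially different spellings of one query count as repeats.
    normalized = [" ".join(q.lower().split()) for q in queries]
    normalized = [q for q in normalized if q]  # drop empty queries

    query_counts = Counter(normalized)
    term_counts = Counter(term for q in normalized for term in q.split())

    total = len(normalized)
    unique = len(query_counts)
    total_terms = sum(len(q.split()) for q in normalized)
    return {
        "total_queries": total,
        "unique_queries": unique,
        # Share of submissions that repeat an already-seen query.
        "repeat_ratio": (total - unique) / total if total else 0.0,
        # Query length measured in whitespace-separated terms.
        "mean_terms_per_query": total_terms / total if total else 0.0,
        "top_queries": query_counts.most_common(3),
        "top_terms": term_counts.most_common(3),
    }


# Made-up sample queries, not data from the study.
sample = [
    "Song Title mp3",
    "song title mp3",
    "movie name avi",
    "artist name",
    "song  title mp3",
]
stats = summarize_queries(sample)
print(stats)
```

Counting terms by whitespace splitting keeps the sketch simple; a real analysis of captured Gnutella query messages would also need to decide how to treat punctuation, file extensions, and queries forwarded by multiple peers.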