Flexible information discovery with guarantees in decentralized distributed systems

  • Authors: Cristina Simona Schmidt; Manish Parashar

  • Affiliations: Rutgers The State University of New Jersey - New Brunswick (both authors)

  • Venue: Flexible information discovery with guarantees in decentralized distributed systems
  • Year: 2005

Abstract

Recent years have seen increasing interest in Peer-to-Peer (P2P) information sharing environments. The P2P computing paradigm enables entities at the edges of the network to interact directly as equals (or peers) and to share information, services and resources without centralized servers. Key characteristics of these systems include decentralization, self-organization, dynamism and fault tolerance, which make them naturally scalable and attractive solutions for information sharing and discovery applications. The ability to efficiently discover information using partial knowledge (for example, queries containing keywords, wildcards and ranges) is an important and challenging issue in large, decentralized, distributed sharing environments such as P2P systems and Computational Grids. Existing P2P information discovery systems each make a different trade-off. Unstructured search systems are easy to maintain and allow complex queries but do not offer any guarantees. Lookup systems offer guarantees, but at the cost of maintaining a complex and constrained structure and of supporting only searches based on exact identifiers. Finally, lookup systems enhanced with keyword searches offer guarantees, but the queries they support are still not expressive enough. In this research we present an innovative approach to building a P2P information discovery system that provides the flexibility of keyword search systems together with the guarantees and bounds of data lookup systems. The fundamental concept underlying our approach is the definition of multidimensional information (keyword) spaces and the maintenance of locality in these spaces. The key innovation is a dimensionality-reducing indexing scheme that effectively maps the multidimensional information space to physical peers while preserving lexical locality. The presented system guarantees that all existing data elements matching a query are found at a reasonable cost in terms of the number of messages and nodes involved. Complex queries containing partial keywords, wildcards and ranges are supported. The design, analysis, implementation and an experimental evaluation of the system are presented.
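
The abstract does not name the indexing scheme, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a two-dimensional keyword space, a base-27 encoding of the first three letters of each keyword, a Z-order (Morton) curve as the dimensionality-reducing map, and a sorted ring of peer identifiers in which each index is stored at its successor. All function names and constants here are hypothetical.

```python
import bisect

ALPHABET = 27    # assumption: 0 = padding, 1..26 = 'a'..'z'
PREFIX_LEN = 3   # assumption: only the first three letters are encoded
BITS = 15        # 27**3 = 19683 fits in 15 bits per dimension


def _digit(ch):
    """Map 'a'..'z' to 1..26; anything else is treated as padding (0)."""
    return ord(ch) - ord("a") + 1 if "a" <= ch <= "z" else 0


def keyword_to_coord(keyword, pad=0):
    """Encode a keyword as an integer that preserves lexicographic
    order on its first PREFIX_LEN letters."""
    digits = [_digit(c) for c in keyword.lower()[:PREFIX_LEN]]
    digits += [pad] * (PREFIX_LEN - len(digits))
    coord = 0
    for d in digits:
        coord = coord * ALPHABET + d
    return coord


def prefix_range(prefix):
    """Coordinate interval covering every keyword that starts with
    `prefix` (a partial-keyword query such as "co*")."""
    return keyword_to_coord(prefix, pad=0), keyword_to_coord(prefix, pad=ALPHABET - 1)


def interleave(x, y, bits=BITS):
    """Z-order index: interleave the bits of x and y into one integer."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)
        z |= ((y >> i) & 1) << (2 * i)
    return z


def responsible_peer(index, peer_ids):
    """Return the first peer id >= index, wrapping around the ring.
    `peer_ids` must be sorted in ascending order."""
    pos = bisect.bisect_left(peer_ids, index)
    return peer_ids[pos % len(peer_ids)]


def place(keywords, peer_ids):
    """Map a (keyword, keyword) pair to its curve index and owning peer."""
    x, y = (keyword_to_coord(k) for k in keywords)
    index = interleave(x, y)
    return index, responsible_peer(index, peer_ids)
```

In this sketch, keywords that share a prefix fall into a contiguous coordinate interval, so a partial-keyword query becomes a range in one dimension. A Z-order curve preserves that locality only approximately: a range in the keyword space generally maps to several disjoint segments of the one-dimensional index, each of which would have to be routed to the peers responsible for it.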