Peer-to-peer (P2P) networks integrate autonomous computing resources without requiring a central coordinating authority, which makes them a potentially robust and scalable model for providing federated search capability to large-scale networks of text digital libraries. However, P2P networks have so far mostly used simple search techniques based on document names or controlled-vocabulary terms, and provided very limited support for full-text search of document contents. This dissertation provides solutions to full-text federated search with relevance-based document ranking within an integrated framework of P2P network overlay, search, and evolution models. Previous notions of P2P network architectures are extended to define a network overlay model with desired content distribution and navigability. Existing approaches to federated search are adapted, and new methods are developed for resource representation, resource selection, and result merging in a network search model according to the unique characteristics of P2P networks. Furthermore, autonomous and decentralized algorithms to evolve the network topology into one with desired search-enhancing properties are proposed in a network evolution model to facilitate effective and efficient full-text federated search in dynamic environments. To demonstrate that the proposed solutions are both effective and practical, two P2P testbeds consisting of thousands of real-content text digital libraries and hundreds of thousands of automatically generated queries are developed. Evaluation using these testbeds provides strong empirical evidence that the approaches proposed in this dissertation provide a better combination of accuracy, efficiency and robustness than more common alternatives.