The past few years have seen tremendous advances in distributed storage infrastructure. Unstructured and structured overlay networks have been successfully used in a variety of applications, ranging from file sharing to scientific data repositories. While unstructured networks benefit from low maintenance overhead, the associated search costs are high. Structured networks, on the other hand, have higher maintenance overheads but facilitate bounded-time search of installed keywords. When dealing with typical data sets, though, it is infeasible to install every possible search term as a keyword into the structured overlay. State-of-the-art semantic indexing techniques have been successfully integrated into peer-to-peer (P2P) systems using semantic overlays. However, existing approaches are based on the premise that the fundamental ingredient of semantic indexing, a semantic basis for the underlying data, is globally available, which is not likely to be the case in practice. Therefore, the development of techniques to efficiently compute basis vectors for data distributed across peers is important for large-scale deployment of semantic indexing in P2P systems. In this paper, we present a novel structured overlay that integrates aspects of semantic indexing, using non-orthogonal matrix decompositions, with the hash structure of the overlay. We adopt PROXIMUS, a recursive decomposition method for computing concise representations of binary data sets, to locally identify latent patterns in data distributed across peers. To enable efficient consolidation of patterns, we rely on distributed hash tables (DHTs), commonly used in various applications in P2P networks. The discrete nature of non-orthogonal matrix decomposition is well suited to the binary key structure of DHTs, resulting in an indexing method, PMINER, that enables the network to deliver efficient and accurate responses to semantic queries. We present the algorithmic underpinnings of PMINER and demonstrate its excellent performance characteristics on real as well as synthetic data sets.
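To make the idea of locally extracting a binary pattern and mapping it onto a DHT key more concrete, the following is a minimal sketch, not the PMINER or PROXIMUS implementation. It assumes a simple alternating heuristic for a binary rank-one approximation (a presence vector `x` over rows and a pattern vector `y` over columns) and assumes that a pattern is turned into a key by hashing its bit string with SHA-1. The function names `rank_one_binary` and `pattern_key`, the initialisation rule, and the choice of hash are all hypothetical and chosen only for illustration.

```python
import hashlib


def rank_one_binary(rows, max_iters=20):
    """Alternating heuristic for a binary rank-one approximation rows ~ x * y^T.

    rows is a list of equal-length lists of 0/1 values.
    Returns (x, y): x marks the rows that contain the pattern, y is the pattern.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    # Assumption: start from the densest row as the initial pattern vector.
    y = list(max(rows, key=sum))
    x = [0] * n_rows
    for _ in range(max_iters):
        # Fix y: select row i if it covers at least half of the ones in y.
        y_weight = sum(y)
        new_x = [
            1 if y_weight > 0
            and 2 * sum(a & b for a, b in zip(row, y)) >= y_weight
            else 0
            for row in rows
        ]
        # Fix x: keep column j if at least half of the selected rows have a one there.
        x_weight = sum(new_x)
        new_y = [
            1 if x_weight > 0
            and 2 * sum(rows[i][j] for i in range(n_rows) if new_x[i]) >= x_weight
            else 0
            for j in range(n_cols)
        ]
        if new_x == x and new_y == y:
            break  # Neither vector changed, so the iteration has converged.
        x, y = new_x, new_y
    return x, y


def pattern_key(y):
    """Hypothetical mapping from a binary pattern to a 160-bit DHT key (hex)."""
    bits = "".join(str(b) for b in y)
    return hashlib.sha1(bits.encode("ascii")).hexdigest()


# Toy peer-local data: three rows share the pattern on the first two columns.
local_rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
presence, pattern = rank_one_binary(local_rows)
key = pattern_key(pattern)
```

In this sketch, peers that discover the same pattern independently would compute the same key, which is one plausible way for a DHT to consolidate patterns found at different peers; how PMINER actually constructs and consolidates keys is not specified in the text above.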