With the ubiquitous production, distribution, and consumption of information, today's digital environments such as the Web are increasingly large and decentralized. Central control over the information collections and systems in these environments is hardly possible. Searching these information spaces raises problems beyond the traditional boundaries of information retrieval (IR) research. This article addresses one important aspect of the scalability challenges facing IR models and investigates a decentralized, organic view of information systems for search in large-scale networks. Drawing on observations from earlier studies, we conduct a series of experiments on decentralized search in large-scale networked information spaces. The results show that the way distributed systems interconnect is crucial to retrieval performance and search scalability. In particular, across various experimental settings and retrieval tasks, we find a consistent phenomenon, the Clustering Paradox, in which the level of network clustering (semantic overlay) imposes a scalability limit. Scalable search is well supported by a specific, balanced level of network clustering that emerges from local system interconnectivity. Departure from that level, toward either stronger or weaker clustering, degrades search performance, and the degradation is dramatic in large-scale networks.
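To make the notion of a "level of network clustering" concrete, the toy simulation below may help. It is not the article's experimental setup: it assumes a ring of nodes in the style of Kleinberg's small-world model, where each node knows its two ring neighbours plus one long-range contact drawn with probability proportional to d^(-alpha), and a query is forwarded greedily to the known node closest to the target. The exponent `alpha` and the function names `build_network`, `greedy_search`, and `mean_hops` are illustrative assumptions, with `alpha` standing in for clustering strength (larger values keep contacts more local).

```python
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)


def build_network(n, alpha, rng):
    """Assumed toy topology: two ring neighbours plus one long-range contact.

    The contact's ring distance d is drawn with probability proportional
    to d ** -alpha, so a larger alpha means more local (clustered) links.
    """
    distances = list(range(1, n // 2 + 1))
    weights = [d ** -alpha for d in distances]
    links = []
    for node in range(n):
        d = rng.choices(distances, weights=weights, k=1)[0]
        direction = rng.choice((-1, 1))
        contact = (node + direction * d) % n
        links.append([(node - 1) % n, (node + 1) % n, contact])
    return links


def greedy_search(links, source, target):
    """Forward to the known node nearest the target; return the hop count."""
    n = len(links)
    current, hops = source, 0
    while current != target:
        # A ring neighbour always reduces the distance, so this terminates.
        current = min(links[current], key=lambda v: ring_distance(v, target, n))
        hops += 1
    return hops


def mean_hops(n, alpha, trials=200, seed=0):
    """Average greedy search length over random source/target pairs."""
    rng = random.Random(seed)
    links = build_network(n, alpha, rng)
    total = 0
    for _ in range(trials):
        source, target = rng.sample(range(n), 2)
        total += greedy_search(links, source, target)
    return total / trials


if __name__ == "__main__":
    # Compare weak, intermediate, and strong clustering on the same ring size.
    for alpha in (0.0, 1.0, 3.0):
        print(alpha, mean_hops(1000, alpha))
```

Running the script prints the mean search length for three values of `alpha`; repeating it for growing `n` is one simple way to see how the choice of clustering level interacts with network size in this simplified model.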