Popular search engines rely essentially on the structure of the graph of linked elements to find the most relevant results for a given query. While this approach is satisfactory for popular interest domains, or when user expectations follow the main trend, it performs poorly on ambiguous queries, whose answers can span several different domains. Elements pertaining to an implicitly targeted interest domain with low popularity are usually ranked lower than the user expects. This is a consequence of the poor use of user-centric information in search engines. Leveraging semantic information can help avoid such situations by proposing complementary results that are tailored to match user interests. This paper proposes CoFeed, a collaborative search companion system that collects user search queries and access feedback to build user- and document-centric profiling information. Over time, the system constructs ranked collections of elements that maintain the required information diversity and enhance the user search experience by presenting additional results tailored to the user's interest space. This collaborative search companion requires a supporting architecture adapted to large user populations generating high request loads. To that end, it integrates mechanisms that ensure scalability and load balancing of the service under varying loads and user interest distributions. Experiments with a deployed prototype highlight the efficiency of the system by analyzing the improvement in search relevance, computational cost, scalability, and load balance.