We have investigated two major issues in Distributed Information Retrieval (DIR): collection selection and search results merging. Whereas most published work on these two issues relies on pre-stored metadata, the approaches described in this paper extract the required information at the time the query is processed. To predict the relevance of collections to a given query, we analyse a limited number of full documents (e.g., the top five documents) retrieved from each collection and consider term proximity within them. Our merging technique, by contrast, is simple: its only inputs are the document scores and the lengths of the result lists. Our experiments evaluate the retrieval effectiveness of these approaches and compare them with centralised indexing and various other DIR techniques (e.g., CORI). We conducted the experiments on two testbeds: one containing news articles extracted from four different sources (2 GB) and another containing 10 GB of Web pages. Our evaluations indicate that the retrieval effectiveness of these simple approaches makes them worth considering.
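The two ideas in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration, not the paper's method: the function names `proximity_score` and `merge_results`, the inverse-distance proximity measure, the `window` and `k` parameters, and the log-based length weighting are all hypothetical.

```python
import math
from itertools import combinations


def proximity_score(query_terms, top_docs, window=5):
    """Score a collection from the few top documents it returned.

    Assumed measure (not from the paper): for every pair of distinct
    query terms, add 1/distance for each pair of occurrences that lie
    within `window` tokens of each other.
    """
    terms = sorted({t.lower() for t in query_terms})
    score = 0.0
    for doc in top_docs:
        tokens = doc.lower().split()
        positions = {
            t: [i for i, tok in enumerate(tokens) if tok == t] for t in terms
        }
        for a, b in combinations(terms, 2):
            for i in positions[a]:
                for j in positions[b]:
                    distance = abs(i - j)
                    if 0 < distance <= window:
                        score += 1.0 / distance
    return score


def merge_results(result_lists, k=600.0):
    """Merge per-collection result lists into one ranked list.

    `result_lists` maps a collection name to a list of (doc_id, score).
    Only document scores and list lengths are used. The weighting below
    is an assumed example: collections returning longer lists receive a
    weight above 1, shorter ones a weight below 1.
    """
    total = sum(len(results) for results in result_lists.values())
    if total == 0:
        return []
    raw = {
        name: math.log(1 + len(results) * k / total)
        for name, results in result_lists.items()
    }
    mean_raw = sum(raw.values()) / len(raw)
    merged = []
    for name, results in result_lists.items():
        weight = 1 + (raw[name] - mean_raw) / mean_raw
        merged.extend((doc_id, score * weight, name) for doc_id, score in results)
    # Highest weighted score first.
    return sorted(merged, key=lambda item: item[1], reverse=True)
```

In this sketch a broker would call `proximity_score` on the top documents fetched from each collection at query time, keep the highest-scoring collections, and pass their result lists to `merge_results`. No pre-stored collection statistics are needed, which is the property the abstract emphasises.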