The effectiveness of GlOSS for the text database discovery problem
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering
Proceedings of the seventh ACM conference on Hypertext
GlOSS: text-source discovery over the Internet
ACM Transactions on Database Systems (TODS)
Boolean Similarity Measures for Resource Discovery
IEEE Transactions on Knowledge and Data Engineering
The MyVIEW Project: A Data Warehousing Approach to Personalized Digital Libraries
NGIT '99 Proceedings of the 4th International Workshop on Next Generation Information Technologies and Systems
Query-driven document partitioning and collection selection
InfoScale '06 Proceedings of the 1st international conference on Scalable information systems
Mining Query Logs: Turning Search Usage Data into Knowledge
Foundations and Trends in Information Retrieval
The popularity of information retrieval has led users to a new problem: finding which text databases (out of thousands of candidate choices) are the most relevant to a user. Answering a given query with a list of relevant databases is the text database discovery problem. The first part of this paper presents a practical method for attacking this problem based on estimating the result size of a query at a database. The method is termed GlOSS (Glossary of Servers Server). The second part of this paper evaluates GlOSS using four different semantics to answer a user's queries. Real users' queries were used in the experiments. We also describe several variations of GlOSS and compare their efficacy. In addition, we analyze the storage cost of our approach to the problem.
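The idea of ranking databases by estimated result size can be illustrated with a minimal sketch. The abstract does not spell out the estimator, so the code below assumes a conjunctive (AND) query and statistically independent terms, and it assumes each database exports only its document count and per-term document frequencies. All names and figures (`SUMMARIES`, `estimate_result_size`, `rank_databases`, the three databases) are hypothetical and chosen for illustration only.

```python
from math import prod

# Hypothetical per-database summaries (assumption, not data from the paper):
# total document count plus, for each term, the number of documents
# that contain it (document frequency).
SUMMARIES = {
    "db_a": {"size": 1000, "df": {"database": 200, "retrieval": 100}},
    "db_b": {"size": 500, "df": {"database": 10, "retrieval": 400}},
    "db_c": {"size": 2000, "df": {"database": 0, "retrieval": 50}},
}


def estimate_result_size(summary, terms):
    """Estimate how many documents contain all terms.

    Assumes terms occur independently, so the expected number of matches
    is size * product(df(term) / size) over the query terms.
    """
    size = summary["size"]
    if size == 0:
        return 0.0
    return size * prod(summary["df"].get(term, 0) / size for term in terms)


def rank_databases(summaries, terms):
    """Return (name, estimate) pairs with a nonzero estimate, largest first."""
    estimates = {
        name: estimate_result_size(summary, terms)
        for name, summary in summaries.items()
    }
    ranked = sorted(estimates.items(), key=lambda pair: pair[1], reverse=True)
    return [(name, estimate) for name, estimate in ranked if estimate > 0]


if __name__ == "__main__":
    for name, estimate in rank_databases(SUMMARIES, ["database", "retrieval"]):
        print(f"{name}: about {estimate:.1f} matching documents")
```

With these made-up summaries, `db_a` ranks ahead of `db_b`, and `db_c` is dropped because one query term never occurs in it. The point of the sketch is that the ranking needs only compact frequency statistics, not the documents themselves, which is why storage cost is a natural thing to analyze for this kind of approach.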