Clear and precise queries are a necessity when searching very large document collections, especially when query-based retrieval is the only means of exploration. We propose system-mediated information access as a solution to users' well-documented inability to formulate good queries. Our approach rests on two main assumptions: first, that document clustering can reveal the topical, semantic structure of a problem domain represented by a specialized "source collection," and, second, that statistical language models can convey content. Taking the role of the human mediator or intermediary searcher, a mediation system interacts with the user and supports her exploration of a relatively small source collection, chosen to be representative of the problem domain. Based on the user's selection of relevant "exemplary" documents and clusters from this source collection, the system builds a language model of her information need. This model is subsequently used to derive "mediated queries," which are expected to convey the user's information need precisely and comprehensively, and which the user can submit to search any large, heterogeneous "target collection." We present results of experiments that simulated various mediation strategies and compared the effects of a variety of parameters, such as the similarity measure, the weighting scheme, and the clustering method, on mediation effectiveness. The results provide both upper bounds on the performance that real end users could potentially reach and a comparison of the effectiveness of these strategies. The experimental evidence suggests that information retrieval mediated through a clustered specialized collection has the potential to improve retrieval effectiveness significantly.
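As an illustration only, the step from exemplary documents to a mediated query might be sketched as follows. The abstract does not specify the estimator or the term-selection rule, so everything here is an assumption: a unigram model of the selected documents is compared with an add-mu smoothed model of the whole source collection, and terms are ranked by their contribution to the KL divergence between the two. The names `tokenize`, `unigram_counts`, and `mediated_query`, and the parameters `k` and `mu`, are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that keeps alphabetic tokens only."""
    return [t for t in text.lower().split() if t.isalpha()]


def unigram_counts(docs):
    """Return term counts and total token count over a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts, sum(counts.values())


def mediated_query(exemplary_docs, source_collection, k=5, mu=1.0):
    """Pick the k terms that best separate the exemplary documents
    from the source collection (assumed scoring: per-term KL contribution)."""
    ex_counts, ex_total = unigram_counts(exemplary_docs)
    bg_counts, bg_total = unigram_counts(source_collection)
    vocab_size = len(set(ex_counts) | set(bg_counts))

    scores = {}
    for term, count in ex_counts.items():
        p_ex = count / ex_total
        # Add-mu smoothing keeps the background probability non-zero.
        p_bg = (bg_counts[term] + mu) / (bg_total + mu * vocab_size)
        scores[term] = p_ex * math.log(p_ex / p_bg)

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


source = [
    "document clustering groups similar documents",
    "language models estimate word probabilities",
    "query logs show short web queries",
    "clustering reveals topical structure",
]
# The user marks two documents of the source collection as exemplary.
query_terms = mediated_query([source[0], source[3]], source, k=3)
print(query_terms)
```

In this toy setting the resulting term list would then be submitted as a query to a separate target collection; a real system would also need stop-word handling, stemming, and term weights, none of which are modeled here.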