Searching documents based on relevance and type

Authors:
Jun Xu; Yunbo Cao; Hang Li; Nick Craswell; Yalou Huang
Affiliations:
Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Research Cambridge, UK; Nankai University, Tianjin, China
Venue:
ECIR'07 Proceedings of the 29th European conference on IR research
Year:
2007

Abstract

This paper extends previous work on document retrieval and document type classification to address the problem of 'typed search'. Specifically, given a query and a designated document type, the search system retrieves and ranks documents based not only on their relevance to the query, but also on their likelihood of belonging to the designated document type. The paper formalizes the problem in a general framework consisting of a 'relevance model' and a 'type model'. The relevance model indicates whether or not a document is relevant to a query. The type model indicates whether or not a document belongs to the designated document type. We consider three methods for combining the models: linear combination of scores, thresholding on the type score, and a hybrid of the previous two methods. We take course page search and instruction document search as examples and conduct a series of experiments. Experimental results show that our proposed approaches can significantly outperform the baseline methods.
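The three combination methods named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the names `ScoredDoc`, `linear_combination`, `type_threshold`, and `hybrid`, the parameters `weight` and `threshold`, the scores in [0, 1], and in particular the reading of the hybrid as "filter by type score, then rank the survivors by the linear combination". It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class ScoredDoc:
    doc_id: str
    relevance: float   # relevance-model score for the query (assumed to lie in [0, 1])
    type_score: float  # type-model score for the designated type (assumed to lie in [0, 1])


def linear_combination(docs, weight=0.5):
    """Rank by a weighted sum of the two scores (weight is a hypothetical parameter)."""
    return sorted(
        docs,
        key=lambda d: weight * d.relevance + (1 - weight) * d.type_score,
        reverse=True,
    )


def type_threshold(docs, threshold=0.5):
    """Drop documents whose type score is below the threshold, then rank by relevance."""
    kept = [d for d in docs if d.type_score >= threshold]
    return sorted(kept, key=lambda d: d.relevance, reverse=True)


def hybrid(docs, weight=0.5, threshold=0.5):
    """Assumed hybrid: filter by type score, then rank survivors by the weighted sum."""
    kept = [d for d in docs if d.type_score >= threshold]
    return linear_combination(kept, weight)


# Toy scores, invented purely for illustration.
docs = [
    ScoredDoc("a", relevance=0.9, type_score=0.2),
    ScoredDoc("b", relevance=0.6, type_score=0.9),
    ScoredDoc("c", relevance=0.7, type_score=0.6),
]

for rank_fn in (linear_combination, type_threshold, hybrid):
    print(rank_fn.__name__, [d.doc_id for d in rank_fn(docs)])
```

On these toy scores, the linear combination keeps the highly relevant but off-type document "a" at the bottom of the ranking, while both threshold-based variants remove it entirely. That difference is the practical trade-off between soft and hard use of the type model.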