This paper extends previous work on document retrieval and document type classification to address the problem of 'typed search'. Given a query and a designated document type, the search system retrieves and ranks documents based not only on their relevance to the query but also on their likelihood of belonging to the designated document type. The paper formalizes the problem in a general framework consisting of a 'relevance model' and a 'type model'. The relevance model indicates whether a document is relevant to the query; the type model indicates whether a document belongs to the designated document type. We consider three methods for combining the two models: linear combination of their scores, thresholding on the type score, and a hybrid of the two. We take course page search and instruction document search as examples and conduct a series of experiments on them. The experimental results show that the proposed approaches can significantly outperform the baseline methods.
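The three combination methods named above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact formulas, so the score ranges, the weight `alpha`, the cutoff `theta`, the names `ScoredDoc`, `linear_combination`, `threshold_on_type` and `hybrid`, and in particular the reading of the hybrid as "filter by type score, then rank by the linear combination" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredDoc:
    doc_id: str
    relevance: float   # relevance-model score (assumed to lie in [0, 1])
    type_score: float  # type-model score (assumed to lie in [0, 1])


def linear_combination(docs: List[ScoredDoc], alpha: float = 0.5) -> List[ScoredDoc]:
    """Rank by alpha * relevance + (1 - alpha) * type_score (alpha is hypothetical)."""
    return sorted(
        docs,
        key=lambda d: alpha * d.relevance + (1 - alpha) * d.type_score,
        reverse=True,
    )


def threshold_on_type(docs: List[ScoredDoc], theta: float = 0.5) -> List[ScoredDoc]:
    """Drop documents whose type score is below theta, then rank by relevance."""
    kept = [d for d in docs if d.type_score >= theta]
    return sorted(kept, key=lambda d: d.relevance, reverse=True)


def hybrid(docs: List[ScoredDoc], alpha: float = 0.5, theta: float = 0.5) -> List[ScoredDoc]:
    """Assumed hybrid: filter on the type score, then rank by the linear combination."""
    kept = [d for d in docs if d.type_score >= theta]
    return linear_combination(kept, alpha)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    docs = [
        ScoredDoc("a", relevance=0.9, type_score=0.2),
        ScoredDoc("b", relevance=0.6, type_score=0.9),
        ScoredDoc("c", relevance=0.7, type_score=0.6),
    ]
    for name, ranking in [
        ("linear", linear_combination(docs)),
        ("threshold", threshold_on_type(docs)),
        ("hybrid", hybrid(docs)),
    ]:
        print(name, [d.doc_id for d in ranking])
```

In this sketch, thresholding acts as a hard filter (a highly relevant document of the wrong type disappears from the ranking), whereas the linear combination only demotes it; the assumed hybrid applies both effects.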