Constructing query-biased summaries: a comparison of human and system generated snippets

  • Authors:
  • Lorena Leal Bando; Falk Scholer; Andrew Turpin

  • Affiliations:
  • RMIT University, Melbourne, Australia (Leal Bando, Scholer); The University of Melbourne, Melbourne, Australia (Turpin)

  • Venue:
  • Proceedings of the Third Symposium on Information Interaction in Context
  • Year:
  • 2010

Abstract

Modern search engines display a summary for each ranked document that is returned in response to a query. These summaries typically include a snippet (a collection of text fragments from the underlying document) that bears some relation to the query being answered. In this study we investigate how 10 humans construct snippets: for four queries over two documents, participants first generate their own natural-language snippet, and then separately extract a snippet by choosing text fragments from the document. By mapping their generated snippets back to text fragments in the source document using eye-tracking data, we observe that participants extract these same pieces of text around 73% of the time when creating their extractive snippets. In comparison, automated approaches for extracting snippets use these same fragments only 10% of the time. However, when the automated methods are evaluated using a position-independent bag-of-words approach, as is typical in the research literature on snippet evaluation, they score much more highly, seemingly extracting the "correct" text 24% of the time. In addition to demonstrating, with our novel methodology, this large scope for improvement in snippet generation algorithms, we also offer a series of observations on the behaviour of participants as they constructed their snippets.
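
The gap between fragment-level agreement (10%) and the bag-of-words score (24%) comes down to how overlap is measured. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the tokenizer, the example snippets, and all function names are hypothetical. It contrasts exact fragment matching with a position-independent bag-of-words overlap, showing how the latter can reward a snippet that shares vocabulary with a human-chosen one while containing none of the same fragments.

```python
import re

def tokenize(text):
    """Split into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def fragment_overlap(system_fragments, human_fragments):
    """Fraction of system fragments that exactly match a human-chosen fragment."""
    human = set(human_fragments)
    matched = sum(1 for frag in system_fragments if frag in human)
    return matched / len(system_fragments) if system_fragments else 0.0

def bag_of_words_overlap(system_snippet, human_snippet):
    """Position-independent overlap: shared vocabulary as a fraction of
    the system snippet's vocabulary (word order and boundaries ignored)."""
    system_words = set(tokenize(system_snippet))
    human_words = set(tokenize(human_snippet))
    return len(system_words & human_words) / len(system_words) if system_words else 0.0

# Hypothetical snippets: no fragment matches exactly, yet most words are shared.
human = ["search engines display a summary", "snippets relate to the query"]
system = ["the query relates to snippets", "engines display search summaries"]

print(fragment_overlap(system, human))                          # 0.0
print(bag_of_words_overlap(" ".join(system), " ".join(human)))  # ~0.78
```

Because bag-of-words scoring ignores fragment boundaries and word order, an automated snippet can appear much closer to a human gold standard than a fragment-level comparison would suggest, which is consistent with the 24% versus 10% gap reported in the abstract.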