Generating synopses for document-element search

Authors:
Sumit Bhatia;Shibamouli Lahiri;Prasenjit Mitra
Affiliations:
The Pennsylvania State University, University Park, PA, USA;The Pennsylvania State University, University Park, PA, USA;The Pennsylvania State University, University Park, PA, USA
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Abstract

Scientists often search for document-elements such as tables, figures, or algorithm pseudo-code. Domain scientists and researchers report important data, results, and algorithms using these document-elements, and readers want to compare the reported results with their own findings. Some document-element search engines have been proposed, especially for tables and figures, to make this task easier. Today, a user searching for document-elements is shown the caption of the document-element and a sentence from the document text that refers to it. Often, the caption and the reference text do not contain enough information to interpret the document-element. In this paper, we present the first set of methods to automatically extract the additional information needed to interpret a document-element, which we call its synopsis. We also investigate the problem of choosing the optimum synopsis size, one that strikes a balance between the information content and the size of the generated synopses.