Knowledge bases about entities have become a vital asset for Web search, recommendations, and analytics. Examples include Freebase, which forms the core of the Google Knowledge Graph, and Wikipedia, which is used for distant supervision in numerous IR and NLP tasks. However, maintaining knowledge about less prominent, long-tail entities is often a bottleneck, as human contributors face the tedious task of continuously identifying and reading relevant sources. To overcome this limitation and accelerate the maintenance of knowledge bases, we propose an approach that automatically extracts key contents from the Web for given input entities. Our method, called GEM, generates salient contents about a given entity, making minimal assumptions about the underlying sources while respecting the constraint that the user is willing to read only a certain amount of information. Salient content pieces have variable length and are computed by solving a budget-constrained optimization problem that decides which sub-pieces of an input text to select for the final result. GEM can be applied to a variety of knowledge-gathering settings, including news streams and speech input from videos. Our experimental studies show the viability of the approach and demonstrate improvements over various baselines in terms of precision and recall.
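The abstract does not state GEM's actual objective function or solver. As a rough illustration only, budget-constrained selection of text pieces can be framed as a 0/1 knapsack problem: each candidate piece has a length (its reading cost) and a salience score, and the goal is to maximize total salience without exceeding the reading budget. The function name `select_pieces`, the tuple layout, and the scores below are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical candidate representation: (text, length_in_words, salience_score).
Piece = Tuple[str, int, float]


def select_pieces(pieces: List[Piece], budget: int) -> List[str]:
    """Pick pieces maximizing total salience with total length <= budget.

    Plain 0/1 knapsack by dynamic programming; an illustrative stand-in,
    not the optimization used by GEM. Lengths are assumed to be
    non-negative integers.
    """
    # best[b] = (best total score, indices chosen) using at most b words.
    best: List[Tuple[float, List[int]]] = [(0.0, [])] * (budget + 1)
    for i, (_, length, score) in enumerate(pieces):
        # Iterate budgets downward so each piece is used at most once.
        for b in range(budget, length - 1, -1):
            candidate = best[b - length][0] + score
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - length][1] + [i])
    return [pieces[i][0] for i in best[budget][1]]


# Made-up candidates and scores, purely for illustration.
candidates: List[Piece] = [
    ("founded in 1998 as a research spin-off", 5, 3.0),
    ("headquartered near the river", 4, 2.5),
    ("acquired by a rival", 3, 2.0),
    ("the weather that day was unremarkable", 6, 1.0),
]
selected = select_pieces(candidates, budget=8)
```

With a budget of 8 words, the sketch prefers the first and third candidates (total length 8, total score 5.0) over any other feasible combination. A real system would also need to score salience and generate variable-length candidates, which this sketch takes as given.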