Gem-based entity-knowledge maintenance

Authors:
Bilyana Taneva;Gerhard Weikum
Affiliations:
Max-Planck Institute for Informatics, Saarbrücken, Germany;Max-Planck Institute for Informatics, Saarbrücken, Germany
Venue:
Proceedings of the 22nd ACM international conference on information & knowledge management
Year:
2013


Abstract

Knowledge bases about entities have become a vital asset for Web search, recommendations, and analytics. Examples include Freebase, which forms the core of the Google Knowledge Graph, and Wikipedia, which is used for distant supervision in numerous IR and NLP tasks. However, maintaining knowledge about less prominent entities in the long tail is often a bottleneck, as human contributors face the tedious task of continuously identifying and reading relevant sources. To overcome this limitation and accelerate the maintenance of knowledge bases, we propose an approach that automatically extracts key contents for given input entities from the Web. Our method, called GEM, generates salient contents about a given entity, using minimal assumptions about the underlying sources, while meeting the constraint that the user is willing to read only a certain amount of information. Salient content pieces have variable length and are computed by solving a budget-constrained optimization problem that decides which sub-pieces of an input text should be selected for the final result. GEM can be applied to a variety of knowledge-gathering settings, including news streams and speech input from videos. Our experimental studies show the viability of the approach and demonstrate improvements over various baselines in terms of precision and recall.
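
To make the idea of budget-constrained selection concrete, the following is a minimal illustrative sketch and not GEM's actual formulation, which the abstract does not spell out. It assumes the input text has already been split into candidate passages, that each passage carries a hypothetical salience score, and that the reading budget is a word count. Under those assumptions the selection reduces to a 0/1 knapsack problem, solved here by dynamic programming. The function name `select_passages` and the example scores are invented for illustration.

```python
from typing import List, Tuple


def select_passages(passages: List[Tuple[str, float]], budget: int) -> List[int]:
    """Pick passages maximizing total salience within a word budget.

    Assumption (not from the paper): passages are pre-segmented and each
    comes with a precomputed salience score; cost is the word count.
    Returns the indices of the chosen passages in document order.
    """
    lengths = [len(text.split()) for text, _ in passages]
    scores = [score for _, score in passages]
    n = len(passages)

    # best[i][b] = highest total score using the first i passages
    # with at most b words.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, s = lengths[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # skip passage i-1
            if w <= b and best[i - 1][b - w] + s > best[i][b]:
                best[i][b] = best[i - 1][b - w] + s  # take passage i-1

    # Walk back through the table to recover which passages were taken.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    chosen.reverse()
    return chosen


if __name__ == "__main__":
    # Hypothetical passages with made-up salience scores.
    candidates = [
        ("born in Sofia", 3.0),
        ("won award", 2.5),
        ("studied at a university", 4.0),
        ("retired", 0.5),
    ]
    for idx in select_passages(candidates, budget=5):
        print(candidates[idx][0])
```

In this toy setting the budget plays the role of the amount of text a user is willing to read. A method that chooses variable-length sub-pieces, as the abstract describes, would also have to decide the passage boundaries, which this sketch takes as given.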