MagicCube: choosing the best snippet for each aspect of an entity

Authors:
Yexin Wang;Li Zhao;Yan Zhang
Affiliations:
Peking University, Beijing, China;Peking University, Beijing, China;Peking University, Beijing, China
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 8

Summarizing text documents: sentence selection and evaluation metrics
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

OCELOT: a system for summarizing Web pages
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval

Statistics-Based Summarization - Step One: Sentence Compression
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence

Latent dirichlet allocation
The Journal of Machine Learning Research

Web-scale information extraction in knowitall: (preliminary results)
Proceedings of the 13th international conference on World Wide Web

KnowItNow: fast, scalable information extraction from the web
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing

CollabSum: exploiting multiple document clustering for collaborative single document summarizations
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval

Manifold-ranking based topic-focused multi-document summarization
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence

Cited 1

Enriching the contents of enterprises' wiki systems with web information
WAIM'10 Proceedings of the 2010 international conference on Web-age information management

Abstract

Wikis are currently used in business as knowledge management systems, especially within individual organizations. However, building a wiki manually is laborious and time-consuming. To help bootstrap wikis, we propose a method that automatically selects the best snippets for entities to serve as their initial explanations. Our method consists of two steps. First, we extract snippets for each entity from a given set of web pages. Starting from a seed sentence, a snippet grows by absorbing its most relevant neighboring sentences. The sentences are chosen by the Snippet Growth Model, which uses a distance function and an influence function to decide which sentences to add. Second, we pick the best snippet for each aspect of an entity. The combination of all selected snippets serves as the primary description of the entity. We present three progressively more sophisticated methods for the selection step. Experimental results on a real data set show that the proposed method is effective at producing primary descriptions for entities such as employee names.
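
The snippet-growing step described in the abstract can be illustrated with a small sketch. The abstract does not define the distance function or the influence function, so both are assumptions here: `influence` is a plain word-overlap (Jaccard) score and `distance_weight` is an exponential decay with distance from the seed. The names `grow_snippet`, `threshold`, `max_len`, and `decay` are likewise hypothetical. This shows the general shape of seed-based growth, not the authors' Snippet Growth Model.

```python
import math
import re


def tokens(sentence):
    """Lowercased word set; a crude stand-in for real text processing."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def influence(candidate, snippet_tokens):
    """ASSUMED influence function: Jaccard overlap of candidate and snippet."""
    cand = tokens(candidate)
    if not cand or not snippet_tokens:
        return 0.0
    return len(cand & snippet_tokens) / len(cand | snippet_tokens)


def distance_weight(index, seed, decay=0.5):
    """ASSUMED distance function: exponential decay with distance from the seed."""
    return math.exp(-decay * abs(index - seed))


def grow_snippet(sentences, seed, threshold=0.05, max_len=5):
    """Grow a contiguous snippet around sentences[seed] and return its sentences."""
    lo = hi = seed
    snippet_tokens = tokens(sentences[seed])
    while hi - lo + 1 < max_len:
        # Only the two sentences bordering the current snippet are candidates.
        candidates = [i for i in (lo - 1, hi + 1) if 0 <= i < len(sentences)]
        if not candidates:
            break
        scored = [
            (influence(sentences[i], snippet_tokens) * distance_weight(i, seed), i)
            for i in candidates
        ]
        best_score, best = max(scored)
        # Stop once even the better neighbor is not relevant enough.
        if best_score < threshold:
            break
        lo, hi = min(lo, best), max(hi, best)
        snippet_tokens |= tokens(sentences[best])
    return sentences[lo:hi + 1]
```

Calling `grow_snippet(sentences, seed)` with a list of sentences from one page and the index of a sentence mentioning the entity returns the grown snippet. Growth stops when no neighbor scores above `threshold` or the snippet reaches `max_len` sentences; both stopping rules are illustrative choices rather than anything stated in the abstract.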