MagicCube: choosing the best snippet for each aspect of an entity
Proceedings of the 18th ACM conference on Information and knowledge management
Wikis are currently used to provide knowledge management systems within individual enterprises. In such a system, the initial explanations of word entries (entities) can be generated from pages on the enterprise's Intranet. However, these internal pages cannot cover all aspects of an entity. To address this problem, this paper enriches the explanations of entities by exploiting Web pages on the Internet. The task consists of three steps. First, for each entity, an initial page set is retrieved from the Internet with the help of search engines. Second, the pages in this set that correlate strongly with the entity are located. Finally, new snippets are produced from those pages; snippets that enhance the explanation are kept, and redundant ones are discarded. Each candidate snippet is evaluated on two aspects: its correlation with the entity, and its ability to enhance the existing explanation. Experimental results on a real data set show that our proposed method effectively supplements existing explanations by exploiting web pages from outside the enterprise.
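The snippet-selection step described above can be illustrated with a minimal sketch. This is not the paper's actual scoring model; it assumes a simple bag-of-words cosine similarity, a hypothetical weight `alpha` trading off the two evaluation aspects (correlation with the entity, redundancy against the existing explanation), and a greedy selection loop that counts already-chosen snippets toward redundancy:

```python
import math
from collections import Counter

def bag(text: str) -> Counter:
    # Naive bag-of-words representation (real systems would tokenize
    # and weight terms, e.g. with tf-idf).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def score_snippet(snippet: str, entity_desc: str, explanation: str,
                  alpha: float = 0.7) -> float:
    # Two aspects from the abstract: correlation with the entity (reward)
    # and overlap with the existing explanation (redundancy penalty).
    relevance = cosine(bag(snippet), bag(entity_desc))
    redundancy = cosine(bag(snippet), bag(explanation))
    return alpha * relevance - (1 - alpha) * redundancy

def select_snippets(candidates, entity_desc, existing_expl,
                    threshold: float = 0.0):
    # Greedily keep snippets that still add value; chosen snippets are
    # appended to the context so later duplicates get penalized.
    chosen, context = [], existing_expl
    ranked = sorted(candidates, reverse=True,
                    key=lambda s: score_snippet(s, entity_desc, existing_expl))
    for s in ranked:
        if score_snippet(s, entity_desc, context) > threshold:
            chosen.append(s)
            context += " " + s
    return chosen
```

For example, given an entity description and an existing explanation, an off-topic candidate scores zero relevance and is discarded, while an on-topic snippet that adds unseen terms is kept.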