Enriching the contents of enterprises' wiki systems with web information

  • Authors:
  • Li Zhao; Yexin Wang; Congrui Huang; Yan Zhang

  • Affiliations:
  • Department of Machine Intelligence, Peking University, Beijing, China and Key Laboratory on Machine Perception, Ministry of Education, Beijing, China (all authors)

  • Venue:
  • WAIM'10 Proceedings of the 2010 international conference on Web-age information management
  • Year:
  • 2010

Abstract

Wikis are currently used as knowledge management systems within individual enterprises. The initial explanations of word entries (entities) in such a system can be generated from pages on the enterprise's intranet. However, the information on these internal pages cannot cover all aspects of the entities. To address this problem, this paper enriches the explanations of entities by exploiting Web pages on the Internet. The task consists of three steps. First, for each entity, pages are retrieved from the Internet with the help of search engines to form an initial page set. Second, the pages that are highly correlated with the entity are identified within this page set. Finally, new snippets are produced from these pages; those that enhance the existing explanation are kept, and redundant ones are discarded. Each candidate snippet is evaluated on two criteria: its correlation with the entity, and its ability to enhance the existing explanation. Experimental results on a real data set show that the proposed method effectively supplements the existing explanation by exploiting Web pages from outside the enterprise.
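
The final step, scoring each candidate snippet on both its correlation with the entity and its ability to add to the existing explanation, can be illustrated with a minimal sketch. The abstract does not specify the scoring functions, so everything here is an assumption: bag-of-words cosine similarity stands in for both criteria, the two are combined linearly, and selection is greedy. The names `tokens`, `cosine`, `select_snippets`, `weight`, and `min_score` are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split text into lowercase alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_snippets(entity_context, explanation, candidates, k=2, weight=0.5, min_score=0.0):
    """Greedily pick up to k snippets (assumed scoring, not the paper's).

    Each snippet is rewarded for similarity to the entity context and
    penalized for similarity to text that is already covered, i.e. the
    existing explanation plus the snippets chosen so far.
    """
    entity_vec = Counter(tokens(entity_context))
    covered = Counter(tokens(explanation))
    remaining = list(candidates)
    chosen = []

    def score(snippet):
        vec = Counter(tokens(snippet))
        return weight * cosine(vec, entity_vec) - (1 - weight) * cosine(vec, covered)

    while remaining and len(chosen) < k:
        best = max(remaining, key=score)
        if score(best) <= min_score:
            break  # nothing left that is both relevant and new
        chosen.append(best)
        remaining.remove(best)
        covered.update(tokens(best))  # later snippets must add to this one too
    return chosen


# Illustrative inputs (invented for this sketch).
entity_context = "wiki knowledge management system"
explanation = "A wiki is a knowledge management system used inside the enterprise."
redundant = "A wiki is a knowledge management system used inside the enterprise."
useful = "Wiki pages support collaborative editing and version history."
off_topic = "The weather in Beijing is sunny today."

selected = select_snippets(entity_context, explanation, [redundant, useful, off_topic])
print(selected)
```

In this toy input, the redundant snippet is penalized because it repeats the existing explanation, and the off-topic snippet earns no relevance credit, so only the snippet that is both related and new survives. A real system would likely need better text representations than raw word counts.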