Retaining knowledge for document management: Category-tree integration by exploiting category relationships and hierarchical structures

  • Authors:
  • Christopher C. Yang; Jianfeng Lin; Chih-Ping Wei

  • Affiliations:
  • College of Information Science and Technology, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104; Digital Library Laboratory, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR; Department of Information Management, College of Management, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617, Taiwan

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2010

Abstract

The category-tree document-classification structure is widely used by enterprises and information providers to organize, archive, and access documents for effective knowledge management. However, category trees from different sources use different hierarchical structures, which usually makes mapping categories between trees difficult. In this work, we propose a category-tree integration technique. We develop a method to learn the relationships between any two categories and develop operations such as mapping, splitting, and insertion for this integration. According to the parent-child relationship of the integrating categories, the developed decision rules use integration operations to integrate categories from the source category tree with those from the master category tree. A unified category tree can accumulate knowledge from multiple resources without forfeiting the knowledge in individual category trees. Experiments have been conducted to measure the performance of the integration operations and the accuracy of the integrated category trees. The proposed category-tree integration technique achieves greater than 80% integration accuracy, and the insert operation is the most frequently utilized, followed by map and split. The insert operation achieves an F1 of 77%, while the map and split operations achieve F1 scores of 86% and 29%, respectively. © 2010 Wiley Periodicals, Inc.