A collaborative filtering-based approach to personalized document clustering

  • Authors: Chih-Ping Wei; Chin-Sheng Yang; Han-Wei Hsiao

  • Affiliations: Institute of Technology Management, College of Technology Management, National Tsing Hua University, Hsinchu, Taiwan, ROC; Department of Information Management, College of Management, National Sun Yat-sen University, Kaohsiung, Taiwan, ROC; Department of Information Management, College of Economics and Management, National University of Kaohsiung, Kaohsiung, Taiwan, ROC

  • Venue: Decision Support Systems
  • Year: 2008

Abstract

Document clustering is an intentional act that reflects individual preferences with regard to the semantic coherence and relevant categorization of documents. Hence, effective document clustering must consider individual preferences and needs in order to support personalization in document categorization. Most existing document-clustering techniques, generally anchored in purely content-based analysis, generate a single set of clusters for all individuals without tailoring to individuals' preferences and thus cannot support personalization. The partial-clustering-based personalized document-clustering approach, which incorporates a target individual's partial clustering into the document-clustering process, has been proposed to facilitate personalized document clustering. However, given a collection of documents to be clustered, the individual might have categorized only a small subset of the collection into his or her personal folders. In this case, the small partial clustering would degrade the effectiveness of the existing personalized document-clustering approach for this particular individual. In response, we extend this approach and propose the collaborative-filtering-based personalized document-clustering (CFC) technique, which expands the size of an individual's partial clustering by considering those of other users with similar categorization preferences. Our empirical evaluation results suggest that, when given a small partial clustering established by an individual, the proposed CFC technique generally achieves better clustering effectiveness for that individual than does the partial-clustering-based personalized document-clustering technique.
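
The idea of expanding a small partial clustering with those of like-minded users can be illustrated with a minimal Python sketch. The abstract does not specify how CFC measures similarity or how it merges clusterings, so everything below is an assumption made for illustration: a partial clustering is a dictionary from document id to folder name, similarity is pairwise agreement on shared documents, and the hypothetical helpers `agreement` and `expand_partial_clustering` (with their `k` and `min_sim` parameters) are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def agreement(a, b):
    """Assumed similarity: the fraction of document pairs, categorized by both
    users, on which they agree about 'same folder' vs. 'different folder'."""
    shared = sorted(set(a) & set(b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    agree = sum((a[x] == a[y]) == (b[x] == b[y]) for x, y in pairs)
    return agree / len(pairs)


def expand_partial_clustering(target, others, k=2, min_sim=0.5):
    """Illustrative expansion step (not the paper's procedure): documents the
    target has not filed inherit a target folder from similar users' folders."""
    ranked = sorted(((agreement(target, o), o) for o in others),
                    key=lambda pair: pair[0], reverse=True)
    neighbors = [(sim, o) for sim, o in ranked[:k] if sim >= min_sim]

    votes = {}
    for sim, other in neighbors:
        for doc, folder in other.items():
            if doc in target:
                continue
            # Documents the neighbor filed next to `doc` that the target has
            # already categorized vote for the target's folder, weighted by sim.
            for peer, peer_folder in other.items():
                if peer_folder == folder and peer in target:
                    votes.setdefault(doc, Counter())[target[peer]] += sim

    expanded = dict(target)
    for doc, counter in votes.items():
        expanded[doc] = counter.most_common(1)[0][0]
    return expanded


# Hypothetical data: the target has filed only three documents.
target = {"d1": "ml", "d2": "ml", "d3": "db"}
user_a = {"d1": "A", "d2": "A", "d3": "B", "d4": "A", "d5": "B"}
user_b = {"d1": "X", "d2": "Y", "d3": "X", "d4": "Y"}
expanded = expand_partial_clustering(target, [user_a, user_b])
```

In this toy data, `user_a` groups the shared documents the same way the target does, while `user_b` mostly does not and falls below the assumed `min_sim` threshold, so only `user_a` contributes. The expanded partial clustering could then be passed to whatever partial-clustering-based technique is in use, in place of the target's original, smaller one.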