FacetCube: a framework of incorporating prior knowledge into non-negative tensor factorization

  • Authors:
  • Yun Chi;Shenghuo Zhu

  • Affiliations:
  • NEC Laboratories America, Inc., Cupertino, CA, USA;NEC Laboratories America, Inc., Cupertino, CA, USA

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Non-negative tensor factorization (NTF) is a relatively new technique that has been successfully used to extract significant characteristics from polyadic data, such as data in social networks. Because these polyadic data have multiple dimensions (e.g., the author, content, and timestamp of a blog post), NTF fits in naturally and extracts data characteristics jointly from different data dimensions. In the standard NTF, all information comes from the observed data and end users have no control over the outcomes. However, in many applications very often the end users have certain prior knowledge, such as the demographic information about individuals in a social network or a pre-constructed ontology on the contents, and therefore prefer the extracted data characteristics being consistent with such prior knowledge. To allow users' prior knowledge to be naturally incorporated into NTF, in this paper we present a novel framework - FacetCube - that extends the standard non-negative tensor factorization. The new framework allows the end users to control the factorization outputs at three different levels for each of the data dimensions. The proposed framework is intuitively appealing in that it has a close connection to the probabilistic generative models. In addition to introducing the framework, we provide an iterative algorithm for computing the optimal solution to the framework. We also develop an efficient implementation of the algorithm that consists of a series of techniques to make our framework scalable to large data sets. Extensive experimental studies on a paper citation data set and a blog data set demonstrate that our new framework is able to effectively incorporate users' prior knowledge, improves performance over the standard NTF on the task of personalized recommendation, and is scalable to large data sets from real-life applications.