A cross-domain adaptation method for sentiment classification using probabilistic latent analysis

  • Authors: Sheng Gao; Haizhou Li
  • Affiliations: Institute for Infocomm Research, A*STAR, Singapore, Singapore (both authors)
  • Venue: Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year: 2011

Abstract

Sentiment classification has attracted growing attention in recent years because of its potential commercial applications. It typically relies on supervised learning to train classifiers from annotated documents. The challenge is that sentiment domains are diverse, heterogeneous and fast-growing, so a classifier trained on one domain (the source domain) may fail to classify documents from another domain (the target domain). Domain adaptation addresses this problem by making use of labeled samples in the source domain and unlabeled samples in the target domain. This paper presents a new solution, a cross-domain topic indexing (CDTI) method, which finds a common semantic space from prior between-domain term correspondences and term co-occurrences in the cross-domain documents. CDTI characterizes these observations with a mixture model in which each component is a possible topic shared by the source and target domains. The common topics are then used to index the cross-domain content. We evaluate the algorithms on a multi-domain sentiment classification task; the results show that CDTI outperforms the state-of-the-art domain adaptation method, spectral feature alignment (SFA), and the traditional latent semantic indexing method.
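
To make the idea of indexing documents by shared topics concrete, the following is a minimal sketch of plain probabilistic latent semantic analysis (PLSA) fitted by EM on a pooled source-plus-target term-count matrix. It is not the CDTI model from the paper: the prior between-domain term correspondences are omitted entirely, and the function name `plsa`, the toy counts, and the choice of two topics are assumptions made purely for illustration.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit plain PLSA by EM on a (documents x terms) count matrix.

    Illustrative only: this is NOT the paper's CDTI model and ignores
    any prior between-domain term correspondences.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation: P(z|d) per document, P(w|z) per topic.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_terms))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z|d,w), shape (docs, topics, terms).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z

# Hypothetical toy counts: 2 source-domain and 2 target-domain documents
# over a shared 4-term vocabulary.
source = np.array([[3, 1, 0, 0], [2, 2, 0, 1]], dtype=float)
target = np.array([[0, 1, 3, 2], [1, 0, 2, 3]], dtype=float)
pooled = np.vstack([source, target])

# Topics are estimated once on the pooled documents, so every document,
# whatever its domain, is indexed in the same topic space.
doc_topics, topic_terms = plsa(pooled, n_topics=2)
source_features = doc_topics[: len(source)]
target_features = doc_topics[len(source):]
```

In a setup like this, a classifier would be trained on `source_features` with the source labels and applied to `target_features`. Whether the shared topics transfer well depends on how much vocabulary the two domains have in common, which is the gap the paper's term correspondences are meant to address.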