Meta methods for model sharing in personal information systems

  • Authors: Stefan Siersdorfer; Sergej Sizov
  • Affiliations: University of Sheffield, UK; University of Koblenz-Landau, Germany
  • Venue: ACM Transactions on Information Systems (TOIS)
  • Year: 2008

Abstract

This article introduces a methodology for automatically organizing document collections into thematic categories for Personal Information Management (PIM) through efficient, privacy-preserving collaborative sharing of machine learning models. Our objective is to combine multiple independently learned models from several users into an ensemble-based decision model that takes the knowledge of multiple users into account in a decentralized manner, for example, in a peer-to-peer overlay network. The corresponding supervised (classification) and unsupervised (clustering) methods achieve high accuracy by restrictively leaving out uncertain documents rather than assigning them to inappropriate topics or clusters with low confidence. We introduce a formal probabilistic model for the resulting ensemble-based meta methods and explain how it can be used for constructing estimators and for goal-oriented tuning. Comprehensive evaluation results on different reference data sets illustrate the viability of our approach.
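
The idea of combining several users' models and leaving out uncertain documents can be illustrated with a small sketch. The abstract does not specify the combination rule, so the averaged-vote threshold below is an assumption made for illustration only, and the names `meta_classify`, `keyword_model`, and `threshold` are hypothetical. Each shared model votes +1 (document belongs to the topic) or -1 (it does not), and the meta classifier abstains unless the averaged vote is decisive enough.

```python
from typing import Callable, Optional, Sequence

# A "model" is any function mapping a document to +1 (in topic) or -1 (not in topic).
# In a sharing scenario, each model would come from a different user (assumption).
Model = Callable[[str], int]


def keyword_model(keyword: str) -> Model:
    """Toy stand-in for a user's learned classifier (hypothetical)."""
    return lambda doc: 1 if keyword in doc.lower() else -1


def meta_classify(
    models: Sequence[Model], doc: str, threshold: float = 1.0
) -> Optional[int]:
    """Combine the models' votes; return None to leave the document out.

    threshold = 1.0 demands unanimity; smaller values trade fewer
    abstentions for a higher risk of wrong assignments.
    """
    if not models:
        raise ValueError("at least one model is required")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")

    votes = [m(doc) for m in models]
    score = sum(votes) / len(votes)  # averaged vote in [-1, 1]

    if score >= threshold:
        return 1      # confidently in the topic
    if score <= -threshold:
        return -1     # confidently outside the topic
    return None       # uncertain: abstain rather than guess


models = [keyword_model("football"), keyword_model("goal"), keyword_model("match")]

accepted = meta_classify(models, "The football match ended with a late goal")
rejected = meta_classify(models, "Quarterly earnings report")
abstained = meta_classify(models, "Our goal is to match revenue targets")
relaxed = meta_classify(models, "Our goal is to match revenue targets", threshold=0.3)
```

With the unanimity setting, the third document is left out because the toy models disagree; lowering `threshold` assigns it to the topic instead. This is the kind of trade-off between coverage and accuracy that a tuning step would have to settle.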