Background knowledge integration in clustering using purity indexes

  • Authors:
  • Germain Forestier; Cédric Wemmert; Pierre Gançarski

  • Affiliations:
  • Image Sciences, Computer Sciences and Remote Sensing Laboratory, University of Strasbourg, France; Image Sciences, Computer Sciences and Remote Sensing Laboratory, University of Strasbourg, France; Image Sciences, Computer Sciences and Remote Sensing Laboratory, University of Strasbourg, France

  • Venue:
  • KSEM'10 Proceedings of the 4th international conference on Knowledge science, engineering and management
  • Year:
  • 2010

Abstract

In recent years, the use of background knowledge to improve the data mining process has been intensively studied. Background knowledge, along with knowledge provided directly or indirectly by the user, is often available. However, this kind of knowledge is difficult to formalize because it usually depends on the domain. In this article, we study the integration of knowledge in the form of labeled objects into clustering algorithms. Several criteria for evaluating the purity of a clustering are presented, and their behaviours are compared on artificial datasets. The advantages and drawbacks of each criterion are analyzed to help the user choose among them.
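
To make the idea of evaluating purity from labeled objects concrete, here is a minimal sketch of the widely used textbook purity measure. It is not necessarily one of the criteria studied in the paper, which the abstract does not list. The function name `cluster_purity` and the convention of marking unlabeled objects with `None` are assumptions made for this illustration.

```python
from collections import Counter


def cluster_purity(cluster_ids, labels):
    """Textbook purity, computed over labeled objects only.

    cluster_ids: cluster assignment of each object.
    labels: known class of each object, or None when the object is
            unlabeled (an assumed convention for this sketch).
    """
    # Count the known labels that fall into each cluster.
    per_cluster = {}
    for cid, lab in zip(cluster_ids, labels):
        if lab is None:
            continue  # unlabeled objects carry no background knowledge
        per_cluster.setdefault(cid, Counter())[lab] += 1

    n_labeled = sum(sum(c.values()) for c in per_cluster.values())
    if n_labeled == 0:
        raise ValueError("purity is undefined without labeled objects")

    # Each cluster contributes the size of its majority class.
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / n_labeled


if __name__ == "__main__":
    clusters = [0, 0, 0, 1, 1, 1]
    known = ["a", "a", "b", "b", "b", None]
    print(cluster_purity(clusters, known))
```

Note that this simple measure reaches its maximum of 1.0 whenever every labeled object sits in its own cluster, so it favours clusterings with many clusters. That is a well-known limitation of the textbook measure, and it illustrates why the choice among purity criteria matters.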