Validation of overlapping clustering: A random clustering perspective

  • Authors:
  • Junjie Wu; Hua Yuan; Hui Xiong; Guoqing Chen

  • Affiliations:
  • School of Economics and Management, Beihang University, Beijing 100191, China; School of Management and Economics, University of Electronic Science and Technology of China, Chengdu 610054, China; Management Science and Information Systems Department, Rutgers University, Newark 07102, NJ, USA; School of Economics and Management, Tsinghua University, Beijing 100084, China

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2010

Quantified Score

Hi-index 0.07

Abstract

As a widely used clustering validation measure, the F-measure has received increased attention in the field of information retrieval. In this paper, we reveal that the F-measure can give a biased view of overlapping clustering results when it is used to validate data with different numbers of clusters (the incremental effect) or different prior probabilities of relevant documents (the prior-probability effect). We propose a new "IMplication Intensity" (IMI) measure, which is based on the F-measure and developed from a random clustering perspective. In addition, we carefully investigate the properties of IMI. Finally, experimental results on real-world data sets show that IMI significantly alleviates the biased incremental and prior-probability effects inherent to the F-measure.
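
For readers unfamiliar with the F-measure as a clustering validation score, the following is a minimal sketch of one common formulation: each ground-truth class is matched to its best-scoring cluster, and the per-class scores are averaged with weights proportional to class size. The abstract does not give the exact definition used in the paper, so this formulation, the function name `f_measure`, and the toy data are assumptions made for illustration only. The IMI measure is not sketched here because the abstract does not define it.

```python
def f_measure(classes, clusters):
    """Class-size-weighted F-measure of a (possibly overlapping) clustering.

    classes  -- dict mapping a class label to the set of object ids in it
    clusters -- list of sets of object ids; sets may overlap

    Assumed formulation (not taken from the paper): for each class, take
    the best harmonic mean of precision and recall over all clusters,
    then weight each class by its share of the total class sizes.
    """
    total = sum(len(members) for members in classes.values())
    score = 0.0
    for members in classes.values():
        best = 0.0
        for cluster in clusters:
            hits = len(members & cluster)
            if hits == 0:
                continue  # no shared objects, so F is 0 for this pair
            precision = hits / len(cluster)
            recall = hits / len(members)
            best = max(best, 2 * precision * recall / (precision + recall))
        score += len(members) / total * best
    return score


# Hypothetical toy data: two classes of four objects each.
classes = {"a": {1, 2, 3, 4}, "b": {5, 6, 7, 8}}
perfect = [{1, 2, 3, 4}, {5, 6, 7, 8}]
overlapping = [{1, 2, 3, 4, 5}, {4, 5, 6, 7, 8}]  # objects 4 and 5 are shared

print(f_measure(classes, perfect))
print(f_measure(classes, overlapping))
```

In this toy case the overlapping clustering scores lower than the perfect one only because each cluster picks up one object from the other class, which reduces precision. The sketch does not reproduce the incremental or prior-probability effects themselves.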