Finding the optimal cardinality value for information bottleneck method

  • Authors:
  • Gang Li;Dong Liu;Yiqing Tu;Yangdong Ye

  • Affiliations:
  • School of Information Technology, Deakin University, Vic, Australia;School of Information Engineering, Zhengzhou University, Zhengzhou, China;School of Information Technology, Deakin University, Vic, Australia;School of Information Engineering, Zhengzhou University, Zhengzhou, China

  • Venue:
  • ADMA'06 Proceedings of the Second International Conference on Advanced Data Mining and Applications
  • Year:
  • 2006

Abstract

The Information Bottleneck method can be used as a dimensionality reduction approach by grouping “similar” features together [1]. In applications, a natural question is how many “feature groups” are appropriate. Many Information Bottleneck algorithms require this number to be supplied as prior knowledge, which restricts their applicability. In this paper we alleviate this dependency by formulating the parameter determination as a model selection problem and solving it with the minimum message length principle. An efficient encoding scheme is designed to describe the information bottleneck solutions and the original data; the minimum message length principle is then used to determine the optimal cardinality value automatically. Empirical results in a document clustering scenario indicate that the proposed method works well for determining the optimal parameter value for the information bottleneck method.
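
To illustrate the general idea of choosing a cardinality by message length, the following is a minimal sketch and not the encoding scheme of the paper, which the abstract does not specify. It assumes a hard grouping of features, a feature-by-target count table, and a crude two-part code: the cost of stating each feature's group plus a rough (1/2) log N charge per free parameter, followed by the cost of the data given the groups. The names `message_length` and `select_cardinality`, and the toy counts, are hypothetical.

```python
import math


def message_length(counts, assignment, k):
    """Two-part message length (in nats) of a hard grouping of features.

    counts[i][j]  : co-occurrence count of feature i with target value j
    assignment[i] : group index in range(k) assigned to feature i
    NOTE: this coding scheme is an illustrative assumption, not the paper's.
    """
    n_features = len(counts)
    n_targets = len(counts[0])
    total = sum(sum(row) for row in counts)

    # Part 1: the model. State the group of every feature, then charge
    # roughly (1/2) log N per free parameter of each group's distribution.
    model_cost = n_features * math.log(k)
    model_cost += 0.5 * k * (n_targets - 1) * math.log(total)

    # Pool the counts of all features that share a group.
    pooled = [[0] * n_targets for _ in range(k)]
    for row, g in zip(counts, assignment):
        for j, c in enumerate(row):
            pooled[g][j] += c

    # Part 2: the data given the model. Each observation is encoded with
    # the empirical distribution of its feature's group.
    data_cost = 0.0
    for group_row in pooled:
        group_total = sum(group_row)
        for c in group_row:
            if c > 0:
                data_cost -= c * math.log(c / group_total)

    return model_cost + data_cost


def select_cardinality(counts, candidates):
    """Return the k whose candidate grouping has the shortest message."""
    return min(candidates, key=lambda k: message_length(counts, candidates[k], k))


# Toy data (made up): features 0-1 behave alike, as do features 2-3.
counts = [[90, 10], [90, 10], [10, 90], [10, 90]]
# Candidate groupings, e.g. as produced by some clustering run for each k.
candidates = {1: [0, 0, 0, 0], 2: [0, 0, 1, 1], 4: [0, 1, 2, 3]}
best_k = select_cardinality(counts, candidates)
print(best_k)
```

On this toy table the two-group solution yields the shortest message: one group describes the data poorly, while four groups pay for extra parameters without describing the data any better. In practice the candidate groupings would come from an Information Bottleneck algorithm run at each cardinality.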