Feature selection for link prediction

  • Authors:
  • Ye Xu; Dan Rockmore

  • Affiliations:
  • Dartmouth College, Hanover, NH, USA; Dartmouth College, Hanover, NH, USA

  • Venue:
  • Proceedings of the 5th Ph.D. workshop on Information and knowledge
  • Year:
  • 2012

Abstract

Networks that model real-world relationships have attracted much attention in the past few years, and link prediction plays a central role in network analysis. Supervised learning is an important class of algorithms for the link prediction problem, and a major challenge in applying it is deciding which features are relevant. Feature selection, a machine learning technique for choosing relevant features, not only enhances classification accuracy but also improves the efficiency of training, so it is especially relevant for link prediction. However, to the best of our knowledge, feature selection under the link prediction scenario remains unstudied. In this paper, we propose FEature Selection for Link Prediction (FESLP), which comprises a feature ranking algorithm and a feature weighting algorithm for link prediction tasks. We measure the discriminative ability of each individual feature and select the features with the greatest discriminative power. At the same time, we minimize the correlations among the selected features so that redundancy in the learned feature space is as small as possible and the feature space preserves the essential structure of the original data. Both feature weighting and feature ranking can be formalized as quadratic optimization problems. The active set method is used to solve the feature weighting problem (a real-valued program), while a greedy policy is applied to the feature ranking problem (an integer program). In experiments, we evaluate FESLP on six large-scale email network datasets from a university. The results show the effectiveness of FESLP.
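The abstract does not state the exact objective, so the following is only a minimal sketch of the general idea behind greedy feature ranking: at each step, pick the feature whose relevance to the link label is highest after subtracting its average correlation with the features already chosen. This is a generic relevance-minus-redundancy (mRMR-style) heuristic, not the authors' FESLP algorithm. The function name `greedy_feature_ranking`, the parameter `redundancy_weight`, and the use of absolute Pearson correlation for both relevance and redundancy are assumptions made for illustration.

```python
import numpy as np


def greedy_feature_ranking(X, y, k, redundancy_weight=1.0):
    """Greedily rank k features by relevance minus redundancy.

    Hypothetical helper (not from the paper). Relevance and redundancy
    are both measured with absolute Pearson correlation, an assumption.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the data so that dot products give (unnormalized) covariances.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Guard against constant columns or a constant label (zero norm).
    x_norm = np.linalg.norm(Xc, axis=0)
    x_norm[x_norm == 0] = 1.0
    y_norm = np.linalg.norm(yc)
    if y_norm == 0:
        y_norm = 1.0

    Xn = Xc / x_norm
    # Relevance: |corr(feature, label)| for every feature.
    relevance = np.abs(Xn.T @ (yc / y_norm))
    # Redundancy: |corr(feature_i, feature_j)| for every pair.
    redundancy = np.abs(Xn.T @ Xn)
    np.fill_diagonal(redundancy, 0.0)

    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < k:
        if selected:
            # Mean correlation of each candidate with the chosen features.
            penalty = redundancy[np.ix_(remaining, selected)].mean(axis=1)
        else:
            penalty = np.zeros(len(remaining))
        scores = relevance[remaining] - redundancy_weight * penalty
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data: all 8 combinations of three binary attributes a, b, c.
    grid = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)])
    a, b, c = grid[:, 0], grid[:, 1], grid[:, 2]
    y = a & b                          # "link exists" only when a and b both hold
    X = np.column_stack([a, a, b, c])  # column 1 duplicates column 0
    print(greedy_feature_ranking(X, y, k=4))
```

In this toy example the duplicated column is as relevant as the original, but the redundancy penalty defers it behind the uncorrelated informative column, which illustrates why the abstract pairs discriminative power with low correlation among the selected features.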