Feature selection for link prediction

Authors:
Ye Xu;Dan Rockmore
Affiliations:
Dartmouth College, Hanover, NH, USA;Dartmouth College, Hanover, NH, USA
Venue:
Proceedings of the 5th Ph.D. workshop on Information and knowledge
Year:
2012


PIKM 2012: 5th ACM workshop for PhD students in information and knowledge management

Proceedings of the 21st ACM international conference on Information and knowledge management


Abstract

Networks that model real-world relationships have attracted much attention in recent years. Link prediction plays a central role in the study of networks, and supervised learning is an important class of algorithms used to address it. A major challenge in solving link prediction tasks is deciding how to choose relevant features. Feature selection, an important machine learning technique for choosing relevant features, not only enhances classification accuracy but also improves the efficiency of the training process, so it is especially relevant for link prediction. However, to the best of our knowledge, feature selection in the link prediction setting remains unstudied. In this paper, we propose FEature Selection for Link Prediction (FESLP), which comprises a feature ranking algorithm and a feature weighting algorithm for link prediction tasks. We measure the discriminative ability of each individual feature and select the features with the greatest discriminative power. Simultaneously, we aim to minimize the correlations among features so that redundancy in the learned feature space is as small as possible. Thus, the feature space can accurately preserve the sketch of the original data. The feature weighting and feature ranking problems can be formalized as two quadratic optimization problems. The active set method is used to solve the feature weighting problem (via real number programming), while a greedy policy is applied to solve the feature ranking problem (via integer programming). In our experiments, we evaluate FESLP on six large-scale email network datasets from a university. The results demonstrate the effectiveness of FESLP.
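
The abstract does not give the FESLP objective, so the following is only a minimal sketch of the general idea behind a greedy feature ranking that rewards discriminative power and penalizes redundancy. It is a generic relevance-minus-redundancy heuristic, not the authors' algorithm. The function name `greedy_feature_ranking`, the `redundancy_weight` parameter, and the use of absolute Pearson correlation as a proxy for both discriminative power and redundancy are all assumptions made for illustration.

```python
import numpy as np


def greedy_feature_ranking(X, y, k, redundancy_weight=1.0):
    """Greedily pick k feature indices (hypothetical helper, not FESLP itself).

    Assumes every column of X and the label vector y are non-constant.
    """
    n_features = X.shape[1]
    # Relevance proxy (assumption): |Pearson correlation| of each feature with the label.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Redundancy proxy (assumption): |Pearson correlation| between feature pairs.
    redundancy = np.abs(np.corrcoef(X, rowvar=False))

    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:

        def gain(j):
            # Mean correlation with the features already chosen; zero for the first pick.
            penalty = np.mean([redundancy[j, s] for s in selected]) if selected else 0.0
            return relevance[j] - redundancy_weight * penalty

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic data: features 0 and 1 are near-duplicates, feature 2 is independent.
rng = np.random.default_rng(0)
f0 = rng.normal(size=2000)
f1 = f0 + 0.01 * rng.normal(size=2000)
f2 = rng.normal(size=2000)
X_demo = np.column_stack([f0, f1, f2])
y_demo = (f0 + f2 > 0).astype(float)  # stand-in for "link" / "no link" labels

print(greedy_feature_ranking(X_demo, y_demo, k=2))
```

On this synthetic data the near-duplicate pair competes for a single slot, because whichever of the two is chosen first makes the other heavily penalized. This mirrors the abstract's goal of keeping redundancy in the selected feature space small. A feature weighting variant would replace the discrete choice with continuous weights and a quadratic program, which the abstract says is solved with an active set method.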