An Iterative Learning Algorithm for Within-Network Regression in the Transductive Setting

Authors:
Annalisa Appice;Michelangelo Ceci;Donato Malerba
Affiliations:
Dipartimento di Informatica, Università degli Studi di Bari, Bari, Italy 70126;Dipartimento di Informatica, Università degli Studi di Bari, Bari, Italy 70126;Dipartimento di Informatica, Università degli Studi di Bari, Bari, Italy 70126
Venue:
DS '09 Proceedings of the 12th International Conference on Discovery Science
Year:
2009

Abstract

Within-network regression addresses regression in partially labeled networked data where labels are sparse and continuous. The data for inference consist of entities associated with nodes whose labels are known, interlinked with nodes whose labels must be estimated. The premise of this work is that many networked datasets exhibit a form of autocorrelation in which the value of the response variable at a node depends on the values of the predictor variables at interlinked nodes. This autocorrelation violates the assumption that observations are independent. To overcome this problem, lagged predictor variables are added to the regression model. We investigate a computational solution in the transductive setting, which requires predicting the response values only for the unlabeled nodes of the network. The neighborhood relation is computed from the node links. We propose a regression inference procedure based on a co-training approach, in which two separate model trees are learned: one from the attribute values of labeled nodes and one from attribute values aggregated over the neighborhoods of labeled nodes. Each model tree labels unlabeled nodes for the other during an iterative learning process. The labeled set is extended with the predicted labels that are estimated to be confident. The confidence estimate is based on the influence of the predicted labels on the known labels of interlinked nodes. Experiments with sparsely labeled networked data show that the proposed method improves on traditional model tree induction.
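
The two-view iterative scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: a k-nearest-neighbor mean stands in for the model trees, the lagged view is a plain average over linked nodes, the confidence score is the reduction in squared error on the known labels of linked nodes, and the final prediction averages the two views. All function and parameter names (`lagged_features`, `knn_predict`, `neighbor_error`, `cotrain`, `rounds`, `k`) are hypothetical.

```python
from statistics import mean

def lagged_features(features, neighbors):
    """Second view: average each attribute over the nodes linked to a node."""
    lagged = {}
    for node, own in features.items():
        linked = [features[n] for n in neighbors.get(node, [])]
        # Assumption: a node without links falls back to its own attributes.
        lagged[node] = [mean(col) for col in zip(*(linked or [own]))]
    return lagged

def knn_predict(train, x, k=2):
    """Stand-in learner (not a model tree): mean label of the k nearest examples."""
    ranked = sorted(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
    return mean(y for _, y in ranked[:k])

def neighbor_error(train, view, labels, nodes, k):
    """Squared error of the learner on the given originally labeled nodes."""
    return sum((labels[v] - knn_predict(train, view[v], k)) ** 2 for v in nodes)

def cotrain(views, labels, neighbors, rounds=10, k=2):
    """views: two dicts node -> attribute vector; labels: node -> known value."""
    known = [dict(labels), dict(labels)]  # one growing training set per view
    for _ in range(rounds):
        added = False
        for src, dst in ((0, 1), (1, 0)):
            train = [(views[src][n], y) for n, y in known[src].items()]
            best = None
            for u in views[src]:
                if u in known[dst]:
                    continue  # already labeled for the other view
                guess = knn_predict(train, views[src][u], k)
                near = [v for v in neighbors.get(u, []) if v in labels]
                # Assumed confidence: how much adding (u, guess) lowers the
                # error on the known labels of the nodes linked to u.
                gain = (neighbor_error(train, views[src], labels, near, k)
                        - neighbor_error(train + [(views[src][u], guess)],
                                         views[src], labels, near, k))
                if gain > 0 and (best is None or gain > best[0]):
                    best = (gain, u, guess)
            if best:
                known[dst][best[1]] = best[2]  # label it for the other view
                added = True
        if not added:
            break  # neither view found a confident label
    # Assumed final step: average both views' predictions on unlabeled nodes.
    trains = [[(views[i][n], y) for n, y in known[i].items()] for i in (0, 1)]
    return {u: mean(knn_predict(trains[i], views[i][u], k) for i in (0, 1))
            for u in views[0] if u not in labels}
```

A call would pass the two views as `cotrain([features, lagged_features(features, neighbors)], labels, neighbors)` and receive a dictionary of predicted values for the unlabeled nodes only, which matches the transductive setting. Each round adds at most one pseudo-label per view, so the loop stops as soon as no candidate lowers the error on its labeled neighbors.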