Within-network regression addresses the task of regression in partially labeled networked data, where labels are sparse and continuous. The data for inference consist of entities associated with nodes whose labels are known, interlinked with nodes whose labels must be estimated. The premise of this work is that many networked datasets exhibit a form of autocorrelation, where the value of the response variable at a node depends on the values of the predictor variables of interlinked nodes. This autocorrelation violates the assumption that observations are independent. To overcome this problem, lagged predictor variables are added to the regression model. We investigate a computational solution to this problem in the transductive setting, which asks for predictions of the response values only for the unlabeled nodes of the network. The neighborhood relation is computed on the basis of the node links. We propose a regression inference procedure based on a co-training approach, according to which two separate model trees are learned: one from the attribute values of the labeled nodes and one from the attribute values aggregated over the neighborhoods of the labeled nodes. During an iterative learning process, each model tree is used to label unlabeled nodes for the other. The set of labeled data grows by including the labels whose estimates are deemed confident, where the confidence estimate is based on the influence of the predicted labels on the known labels of interlinked nodes. Experiments with sparsely labeled networked data show that the proposed method improves on traditional model tree induction.
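The two-view iterative procedure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: scikit-learn's `DecisionTreeRegressor` stands in for the model trees, the neighborhood view is a simple mean over linked nodes, and the confidence measure is replaced by a cruder proxy, the agreement between the two views (the paper instead scores confidence by the influence of a predicted label on the known labels of interlinked nodes). All function and variable names are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def neighbor_aggregates(X, adj):
    """Second view: average the predictor values over each node's neighbors."""
    agg = np.zeros_like(X, dtype=float)
    for i, nbrs in adj.items():
        if nbrs:
            agg[i] = X[list(nbrs)].mean(axis=0)
    return agg

def cotrain_regression(X, y, adj, labeled, n_iter=5, k=2):
    """Co-training-style transductive regression sketch.

    X       : (n, d) node attribute matrix
    y       : (n,) responses; entries outside `labeled` are placeholders
    adj     : dict node -> list of neighbor node indices
    labeled : indices of nodes with known responses
    k       : number of confident labels adopted per iteration
    """
    labeled = set(labeled)
    unlabeled = set(range(len(y))) - labeled
    y = y.astype(float).copy()
    Xa = neighbor_aggregates(X, adj)       # lagged/aggregated predictors
    for _ in range(n_iter):
        if not unlabeled:
            break
        L = sorted(labeled)
        h1 = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X[L], y[L])
        h2 = DecisionTreeRegressor(max_depth=3, random_state=0).fit(Xa[L], y[L])
        U = sorted(unlabeled)
        p1, p2 = h1.predict(X[U]), h2.predict(Xa[U])
        # Confidence proxy (assumption): views that agree are trusted more.
        conf = -np.abs(p1 - p2)
        picks = [U[j] for j in np.argsort(conf)[::-1][:k]]
        for i in picks:
            j = U.index(i)
            y[i] = 0.5 * (p1[j] + p2[j])   # adopt the averaged estimate
            labeled.add(i)
            unlabeled.discard(i)
    return y
```

Each iteration the two trees are refit on the enlarged labeled set, so a label adopted from one view immediately informs the other, which is the essence of the co-training scheme.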