Feature enrichment and selection for transductive classification on networked data

Authors:
Zehra Cataltepe; Abdullah Sonmez; Baris Senliol
Venue:
Pattern Recognition Letters
Year:
2014



Abstract

Networked data consist of nodes and links between them, where the links indicate dependencies between nodes. Content features are available for every node, whereas labels are available only for the training nodes. In transductive classification, given the features of all nodes and the labels of the training nodes, the labels of all remaining nodes are predicted. Learning algorithms that use both node content features and links have been developed. For example, collective classification algorithms use aggregated labels of neighbors (such as their sum or average), in addition to node features, as inputs to a classifier. The classifier is trained using the training data only. At test time, since the neighbors' labels are used as classifier inputs, the labels for the test set need to be determined through an iterative procedure. While labels are usually very difficult to obtain for the whole dataset, features are usually easier to obtain. In this paper, we introduce a new method of transductive network classification which can use the test node features when training the classifier. We train our classifier using enriched node features. In addition to the node's own features, the enriched node features include the aggregated features of its neighbors and aggregations of node and neighbor features passed through the simple logical operators OR and AND. The enriched set may contain irrelevant or redundant features, which could decrease classifier performance. Therefore, we employ feature selection to determine whether each enriched feature should be used for classifier training. Our feature selection method, called FCBF#, is a fast, filter-type feature selection method based on mutual information. Experimental results on three different network datasets show that classification accuracies obtained using network-enriched and selected features are comparable to or better than those obtained using content-only or collective classification.
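
The feature enrichment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `enrich_features`, the choice of binary content features, the use of a per-feature sum as the neighbor aggregate, and the binarization of that sum before applying OR and AND are all assumptions made for illustration. Because the sketch reads only features and links, never labels, it can be applied to training and test nodes alike.

```python
from typing import Dict, List


def enrich_features(
    features: Dict[str, List[int]],
    neighbors: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Build enriched features for every node (hypothetical helper).

    Output layout per node: own features, summed neighbor features,
    own OR neighbor-present, own AND neighbor-present.
    Assumes binary (0/1) content features of equal length for all nodes.
    """
    enriched: Dict[str, List[int]] = {}
    for node, own in features.items():
        # Aggregate neighbor features with a per-feature sum (assumed aggregator).
        nb_sum = [0] * len(own)
        for nb in neighbors.get(node, []):
            nb_sum = [s + f for s, f in zip(nb_sum, features[nb])]
        # Binarize the aggregate: 1 if at least one neighbor has the feature.
        nb_any = [1 if s > 0 else 0 for s in nb_sum]
        # Combine own and neighbor features through logical OR and AND.
        either = [o | a for o, a in zip(own, nb_any)]
        both = [o & a for o, a in zip(own, nb_any)]
        enriched[node] = own + nb_sum + either + both
    return enriched


# Toy network: three nodes with three binary features each.
features = {"a": [1, 0, 0], "b": [0, 1, 0], "c": [1, 1, 0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
enriched = enrich_features(features, neighbors)
```

Each node's vector grows from 3 to 12 entries in this sketch, which shows why a subsequent feature selection step is useful: many of the added columns may be redundant with the original ones or irrelevant to the label.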