Label-dependent node classification in the network

  • Authors: Przemyslaw Kazienko; Tomasz Kajdanowicz
  • Affiliations: Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, Wroclaw, Poland (both authors)
  • Venue: Neurocomputing
  • Year: 2012

Abstract

Relations between objects in various systems, such as hyperlinks connecting web pages, citations of scientific papers, conversations via email, or social interactions in Web 2.0 portals, are commonly modeled by networks. One of many interesting problems currently studied for such domains is node classification. Due to the nature of networked data, and because a broad, representative collection of nodes is unavailable for training in the majority of environments, only very limited data may remain useful for classification. Therefore, there is a need for accurate and efficient algorithms that can classify well based only on scanty knowledge of network nodes. A new sampling algorithm, LDGibbs, used in the context of collective classification with label-dependent features, is proposed in the paper in order to provide more accurate generalization for sparse datasets. Additionally, a new LDBootstrapping algorithm based on label-dependent features has been developed. Both new algorithms include additional steps that extract new input features from graph structures, with each extraction limited to the nodes of a given label. This means that a separate set of structural features is provided for each label. The comparison with other approaches, in particular with standard Gibbs Sampling and bootstrapping, provided satisfactory results and revealed LDGibbs's superiority.
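
The idea of one set of structural features per label can be illustrated with a minimal sketch. It assumes the simplest possible structural measure, node degree, counted separately over the neighbors that carry each known label. The abstract does not list the paper's actual features, so the function name `label_dependent_degree` and the toy graph are hypothetical, and this is not the LDGibbs or LDBootstrapping procedure itself.

```python
from collections import defaultdict


def label_dependent_degree(edges, labels):
    """Count, for every node, how many neighbors carry each known label.

    edges  : iterable of (u, v) pairs describing an undirected graph
    labels : dict mapping node -> label, only for nodes whose label is known
    Returns: dict mapping node -> {label: number of neighbors with that label}

    Assumption: degree stands in for whatever structural measures the
    paper really uses; unlabeled neighbors are ignored.
    """
    # Build an undirected adjacency structure.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    label_set = sorted(set(labels.values()))
    features = {}
    for node, nbrs in neighbors.items():
        # One feature per label: a separate count for each label value.
        counts = {lab: 0 for lab in label_set}
        for n in nbrs:
            if n in labels:
                counts[labels[n]] += 1
        features[node] = counts
    return features


# Toy graph (hypothetical): three labeled nodes, two unlabeled ones.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
labels = {"a": "red", "b": "blue", "c": "red"}
features = label_dependent_degree(edges, labels)
```

Here the unlabeled node `d` receives one feature value per label, which a classifier could then use as input. In a collective-classification loop these counts would be recomputed whenever the labels assigned to unlabeled nodes change.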