A general multi-relational classification approach using feature generation and selection

  • Authors:
  • Miao Zou; Tengjiao Wang; Hongyan Li; Dongqing Yang

  • Affiliations:
  • School of Electronics Engineering and Computer Science, Peking University, Beijing, China (all authors)

  • Venue:
  • ADMA'10: Proceedings of the 6th International Conference on Advanced Data Mining and Applications, Volume Part II
  • Year:
  • 2010

Abstract

Multi-relational classification is an important data mining task, since much real-world data is organized in multiple relations. It poses two major challenges: first, the search space is large and high-dimensional because of the many attributes spread across multiple relations; second, feature selection and classifier construction are computationally expensive because of the structural complexity of those relations. Existing approaches mainly use inductive logic programming (ILP) techniques to derive hypotheses or extract features for classification. However, those methods are often slow and sometimes cannot provide enough information to build effective classifiers. In this paper, we develop a general approach for accurate and fast multi-relational classification using feature generation and selection. We also propose a novel similarity-based feature selection method for multi-relational classification. An extensive performance study on several benchmark data sets indicates that our approach is accurate, fast, and highly scalable.