Enhancing learning algorithms to support data with short sequence features by automated feature discovery

  • Authors: Ofer Dor; Yoram Reich
  • Venue: Knowledge-Based Systems
  • Year: 2013

Abstract

In this paper, we propose a VECtor DIScovery approach, called VECDIS, which enhances the learning performance of existing classifiers by working directly on various data types, and which can discover features composed of multiple feature types for explanatory purposes. The data types may be combinations of multivariate, short time-series, or short sequential data. Each feature in the dataset may hold a single item and/or an ordered list of items of varying length. The approach handles raw vector data without prior manipulation (i.e., preprocessing). The discovered features are built from vector and non-vector mathematical relations. At each iteration, the algorithm generates new vector features and mathematical expression features, which are passed to the next iteration either alongside or in place of previously generated features. The approach can automatically search for and discover thousands of different features obtained by manipulating the sequence features. We performed a large number of experiments with various synthetic and simulated datasets and with a wide range of classifiers. The results show that VECDIS significantly enhanced the classification performance of existing classifiers on datasets that have multiple feature types with short sequence features. Nevertheless, there is no guarantee that the mathematical library presented in this paper is suitable for all sequence datasets or that it will lead to the discovery of a valuable feature set. Therefore, VECDIS allows the mathematical library to be expanded or exchanged as desired.
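
To make the iterative idea in the abstract concrete, the following is a minimal Python sketch and not the VECDIS algorithm itself. Everything in it is an assumption made for illustration: the two small operation libraries (`VECTOR_OPS`, `REDUCERS`), the `separation` scoring rule, and the `discover` function are hypothetical names, and the scoring rule stands in for whatever evaluation the paper actually uses. The sketch only shows how vector features of varying length can be transformed into new vector features, reduced to scalar expression features, and ranked, with the derived vectors carried into the next iteration.

```python
from statistics import mean, pstdev

# Hypothetical operation library (an assumption, not the paper's library).
# Vector -> vector operations produce new sequence features.
VECTOR_OPS = {
    "diff": lambda v: [b - a for a, b in zip(v, v[1:])],
    "cumsum": lambda v: [sum(v[: i + 1]) for i in range(len(v))],
}

# Vector -> scalar operations produce ordinary (non-vector) features.
REDUCERS = {
    "mean": mean,
    "max": max,
    "min": min,
    "last": lambda v: v[-1],
    "len": len,
}


def separation(values, labels):
    """Score a scalar feature for binary labels (0/1): the gap between the
    two class means divided by the overall spread. Assumes both classes
    are present. This is a stand-in criterion, chosen for simplicity."""
    class0 = [x for x, y in zip(values, labels) if y == 0]
    class1 = [x for x, y in zip(values, labels) if y == 1]
    spread = pstdev(values)
    return abs(mean(class0) - mean(class1)) / spread if spread > 0 else 0.0


def discover(sequences, labels, iterations=2, keep=3):
    """sequences maps a feature name to one sequence per record; the
    sequences may have different lengths. Returns the `keep` best
    (feature name, score) pairs found over all iterations."""
    vectors = dict(sequences)
    scored = {}
    for _ in range(iterations):
        # Reduce every current vector feature to scalar candidates.
        for vname, column in vectors.items():
            for rname, reduce_fn in REDUCERS.items():
                name = f"{rname}({vname})"
                if name not in scored:
                    values = [reduce_fn(v) for v in column]
                    scored[name] = separation(values, labels)
        # Derive new vector features and pass them to the next iteration.
        derived = {}
        for vname, column in vectors.items():
            for oname, op in VECTOR_OPS.items():
                new_column = [op(v) for v in column]
                # Skip derivations that leave some record with no items.
                if all(len(v) > 0 for v in new_column):
                    derived[f"{oname}({vname})"] = new_column
        vectors = derived
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    # Toy data: class 1 sequences rise, class 0 sequences fall.
    toy = {"s": [[1, 2, 3], [2, 3, 4, 5], [3, 2, 1], [5, 4, 3, 2]]}
    for name, score in discover(toy, labels=[1, 1, 0, 0]):
        print(name, round(score, 3))
```

On the toy data, the raw mean, maximum, and minimum do not separate the classes, while features built on the first difference do, which is the kind of relation a search over sequence manipulations is meant to surface. Swapping entries in the two dictionaries corresponds loosely to expanding or exchanging the mathematical library.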