Multivariate stream data classification using simple text classifiers

  • Authors:
  • Sungbo Seo;Jaewoo Kang;Dongwon Lee;Keun Ho Ryu

  • Affiliations:
  • Dept. of Computer Science, Chungbuk National University, Chungbuk, Korea;Dept. of Computer Science and Engineering, Korea University, Seoul, Korea;College of Information Sciences and Technology, Penn State University, PA;Dept. of Computer Science, Chungbuk National University, Chungbuk, Korea

  • Venue:
  • DEXA'06 Proceedings of the 17th International Conference on Database and Expert Systems Applications
  • Year:
  • 2006

Abstract

We introduce a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes a sliding window of multivariate stream data as input and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it applies a simple text classification algorithm to the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms: for the supervised setting, we tested a Naïve Bayes model and SVM; for the unsupervised setting, we tested Jaccard, TF-IDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TF-IDF outperformed the other classification methods. In particular, we observed that classification accuracy improved when the correlation between attributes was considered along with the n-gram tokens of symbols.
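
The two-step idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the discretization scheme, so the three-symbol alphabet (U, D, S), the threshold `eps`, the per-attribute token prefix, and the nearest-reference rule in `classify` are all assumptions made for illustration. The sketch also does not model the attribute-correlation tokens mentioned in the abstract.

```python
from typing import List, Sequence, Set, Tuple


def discretize(series: Sequence[float], eps: float = 0.5) -> str:
    """Encode consecutive changes as symbols (assumed alphabet):
    U = rise above eps, D = drop below -eps, S = roughly steady."""
    symbols: List[str] = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > eps:
            symbols.append("U")
        elif delta < -eps:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def ngrams(text: str, n: int = 2) -> Set[str]:
    """Return the set of character n-grams of a symbol string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def window_tokens(window: Sequence[Sequence[float]],
                  n: int = 2, eps: float = 0.5) -> Set[str]:
    """Turn a multivariate window (one series per attribute) into a
    token set. Tokens are prefixed with the attribute index so that
    n-grams from different attributes stay distinct."""
    tokens: Set[str] = set()
    for idx, series in enumerate(window):
        symbol_string = discretize(series, eps)
        tokens |= {f"a{idx}:{gram}" for gram in ngrams(symbol_string, n)}
    return tokens


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(window: Sequence[Sequence[float]],
             references: Sequence[Tuple[str, Sequence[Sequence[float]]]],
             n: int = 2, eps: float = 0.5) -> str:
    """Assign the label of the most Jaccard-similar reference window
    (a hypothetical nearest-reference rule, not taken from the paper)."""
    query = window_tokens(window, n, eps)
    best_label, _ = max(
        references,
        key=lambda item: jaccard(query, window_tokens(item[1], n, eps)),
    )
    return best_label


if __name__ == "__main__":
    # Hypothetical labeled reference windows with two attributes each.
    refs = [
        ("rising", [[0, 1, 2, 3], [5, 5, 5, 5]]),
        ("falling", [[3, 2, 1, 0], [5, 5, 5, 5]]),
    ]
    print(classify([[10, 12, 14, 16], [1, 1, 1, 1]], refs))
```

Because the symbols encode changes rather than absolute values, the query window in the usage example matches the "rising" reference even though its raw values differ. Swapping `jaccard` for a TF-IDF cosine similarity, or feeding the tokens to a supervised text classifier, would follow the same pattern.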