Multivariate stream data classification using simple text classifiers

Authors:
Sungbo Seo; Jaewoo Kang; Dongwon Lee; Keun Ho Ryu
Affiliations:
Dept. of Computer Science, Chungbuk National University, Chungbuk, Korea; Dept. of Computer Science and Engineering, Korea University, Seoul, Korea; College of Information Sciences and Technology, Penn State University, PA; Dept. of Computer Science, Chungbuk National University, Chungbuk, Korea
Venue:
DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Year:
2006

Abstract

We introduce a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes a sliding window of multivariate stream data as input and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it applies a simple text classification algorithm to the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms. For supervised classification, we tested the Naïve Bayes model and SVM; for unsupervised classification, we tested Jaccard, TFIDF, Jaro, and JaroWinkler. In our experiments, SVM and TFIDF outperformed the other classification methods. In particular, we observed that classification accuracy improved when the correlation of attributes was considered along with the n-gram tokens of symbols.
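The two steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the discretization scheme, so everything here is an assumption made for illustration: the three-symbol alphabet (`U`, `D`, `S` for up, down, steady), the threshold `eps`, the attribute-prefixed n-gram tokens, and the helper names `discretize`, `ngrams`, `window_tokens`, `jaccard`, and `classify`. Matching a window against labeled reference windows by Jaccard similarity is only one plausible reading of how a similarity measure could be used, not the authors' implementation.

```python
from typing import Dict, List, Sequence, Set, Tuple


def discretize(values: Sequence[float], eps: float = 0.5) -> str:
    """Turn a window of readings into symbols describing signal changes.

    Hypothetical alphabet: U (rise > eps), D (fall > eps), S (steady).
    """
    symbols = []
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if delta > eps:
            symbols.append("U")
        elif delta < -eps:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def ngrams(s: str, n: int = 2) -> Set[str]:
    """Return the set of character n-grams of a symbol string."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def window_tokens(window: Dict[str, Sequence[float]],
                  n: int = 2, eps: float = 0.5) -> Set[str]:
    """Tokenize a multivariate window; tokens are prefixed by attribute name
    so that n-grams from different attributes do not collide."""
    tokens = set()
    for name, values in window.items():
        for gram in ngrams(discretize(values, eps), n):
            tokens.add(f"{name}:{gram}")
    return tokens


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(query: Set[str], references: List[Tuple[str, Set[str]]]) -> str:
    """Return the label of the reference window most similar to the query."""
    return max(references, key=lambda item: jaccard(query, item[1]))[0]


# Illustrative data only; attribute names and values are made up.
references = [
    ("rising", window_tokens({"temp": [10, 12, 14, 16], "hum": [50, 48, 46, 44]})),
    ("falling", window_tokens({"temp": [16, 14, 12, 10], "hum": [44, 46, 48, 50]})),
]
query = window_tokens({"temp": [20, 21, 22, 24], "hum": [60, 59, 57, 55]})
print(classify(query, references))
```

Because the windows are reduced to token sets, any text-oriented method (a bag-of-words Naïve Bayes or SVM, or a TFIDF-weighted similarity) could be substituted for the Jaccard comparison without changing the preprocessing step.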