Data classification is an important topic in data mining because of its wide range of applications. A number of methods have been proposed based on well-known learning models such as decision trees and neural networks. Although data classification has been widely studied, relatively few studies have explored temporal data classification. Most existing work focuses on improving classification accuracy by using statistical models, neural networks, or distance-based methods. However, these methods cannot explain their classification results to users. In many research settings, such as microarray gene expression analysis, users value interpretable classification information over a classifier that offers only high accuracy. In this paper, we propose a novel pattern-based data mining method, namely classify-by-sequence (CBS), for classifying large temporal datasets. The main idea behind CBS is to integrate sequential pattern mining with probabilistic induction. CBS is simple to implement, and its pattern-based architecture supplies clear classification information to users. In experimental evaluation, CBS delivered highly accurate classification results on two real time series datasets. In addition, we designed a simulator to evaluate the performance of CBS on datasets with different characteristics. The experimental results show that CBS can discover hidden patterns and classify data effectively by utilizing the mined sequential patterns.
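To make the idea of combining sequential pattern mining with a probabilistic score more concrete, the following is a minimal illustrative sketch. It is not the CBS algorithm itself, whose details are not given here. It assumes that time series have already been discretized into symbol strings, mines frequent subsequences per class by brute-force enumeration, and scores a new sequence by summing the per-class support of every pattern it contains. All function names, the `min_support` and `max_len` parameters, and the scoring rule are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(symbol in it for symbol in pattern)


def mine_patterns(sequences, min_support, max_len):
    """Brute-force frequent subsequence mining (illustrative, not efficient).

    Returns a dict mapping each pattern (a tuple of symbols) to its support,
    i.e. the fraction of sequences that contain it.
    """
    counts = defaultdict(int)
    for seq in sequences:
        seen = set()
        for n in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), n):
                seen.add(tuple(seq[i] for i in idx))
        for pattern in seen:  # count each pattern once per sequence
            counts[pattern] += 1
    total = len(sequences)
    return {p: c / total for p, c in counts.items() if c / total >= min_support}


def train(labeled_sequences, min_support=0.5, max_len=3):
    """Mine one pattern set per class label (hypothetical training step)."""
    by_class = defaultdict(list)
    for seq, label in labeled_sequences:
        by_class[label].append(seq)
    return {
        label: mine_patterns(seqs, min_support, max_len)
        for label, seqs in by_class.items()
    }


def classify(model, sequence):
    """Score each class by the summed support of its matching patterns."""
    scores = {
        label: sum(s for p, s in patterns.items() if is_subsequence(p, sequence))
        for label, patterns in model.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: symbols stand for discretized levels of a time series.
training = [
    ("abc", "rise"), ("acbc", "rise"), ("abbc", "rise"),
    ("cba", "fall"), ("caba", "fall"), ("cbba", "fall"),
]
model = train(training)
label, scores = classify(model, "abcc")
print(label, scores)
```

Because the score is built from explicit patterns, the matching patterns themselves can be shown to a user as the reason for a prediction, which is the kind of interpretability a pattern-based classifier aims for. A practical implementation would replace the exhaustive enumeration with a dedicated sequential pattern mining algorithm.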