This paper presents a statistical test and algorithms for pattern extraction and supervised classification of sequential data. It first defines the prediction suffix tree (PST), a tree structure that efficiently describes variable-order Markov chains. A PST performs better than a Markov chain of fixed order L, and at a lower storage cost. We propose an improvement of this model based on a statistical test. The test lets us control the risk of encountering different patterns in the model of the sequence to be classified and in the model of its class. Applications to biological sequences illustrate the procedure. We compare the results obtained with different models (Markov chain of order L, variable-order model, and the statistical test, each with or without smoothing), and we show how the choice of model parameters influences performance in these applications. These algorithms can also be used in other fields in which the data are naturally ordered.
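To make the idea of a variable-order model concrete, here is a minimal Python sketch of next-symbol prediction from the longest retained context. It is an illustration under stated assumptions, not the paper's algorithm: the function names `build_contexts` and `next_symbol_probs`, the `min_count` pruning rule, and the toy sequence are all hypothetical, and the simple count threshold stands in for the statistical test the abstract describes. No smoothing is applied.

```python
from collections import Counter, defaultdict


def build_contexts(sequence, max_order, min_count=2):
    """Count next-symbol frequencies for every context of length 0..max_order.

    Assumption: a context is kept only if it was seen at least `min_count`
    times. This threshold is a placeholder, not the paper's statistical test.
    """
    counts = defaultdict(Counter)
    for i, symbol in enumerate(sequence):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            # The k symbols preceding position i form the context.
            counts[sequence[i - k:i]][symbol] += 1
    # Always keep the empty context so that prediction can fall back to it.
    return {
        ctx: c
        for ctx, c in counts.items()
        if ctx == "" or sum(c.values()) >= min_count
    }


def next_symbol_probs(model, history, max_order):
    """Predict from the longest suffix of `history` retained in the model."""
    for k in range(min(max_order, len(history)), -1, -1):
        ctx = history[len(history) - k:]
        if ctx in model:
            total = sum(model[ctx].values())
            return {s: n / total for s, n in model[ctx].items()}
    return {}


model = build_contexts("abababab", max_order=2)
print(next_symbol_probs(model, "ab", 2))  # {'a': 1.0}
print(next_symbol_probs(model, "xa", 2))  # falls back to context "a": {'b': 1.0}
```

Because only sufficiently frequent contexts are stored, the model can use long memory where the data support it and short memory elsewhere. This is the source of the storage saving relative to a fixed order-L chain, which must keep every context of length L.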