Sequence classification is central to many practical problems within machine learning. Distance metrics between arbitrary pairs of sequences can be hard to define because sequences vary in length, and the information contained in the ordering of sequence elements is lost when standard metrics such as Euclidean distance are applied. We present a scheme that employs a Hidden Markov Model variant to produce a set of fixed-length description vectors from a set of sequences. We then define three inference algorithms for estimating the model parameters: a Baum-Welch variant, a Gibbs sampling algorithm, and a variational algorithm. Finally, we show experimentally that the fixed-length representation produced by these inference methods is useful for classifying amino acid sequences into structural classes.
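To make the core idea concrete, below is a minimal, hypothetical sketch (not the paper's actual model) of how an HMM can map variable-length discrete sequences to fixed-length vectors: run the standard forward-backward algorithm to get posterior state occupancies, then average them over positions, yielding one number per hidden state regardless of sequence length. All parameter values here are illustrative.

```python
# Hypothetical illustration: fixed-length description vectors from an HMM
# via average posterior state occupancy. Parameters are made up, not taken
# from the paper; a real system would fit them with Baum-Welch, Gibbs
# sampling, or variational inference as the abstract describes.

def forward_backward(obs, pi, A, B):
    """Posterior state probabilities gamma[t][s] for a discrete-output HMM.

    obs : list of symbol indices
    pi  : initial state distribution, length n_states
    A   : transition matrix, A[r][s] = P(s_t = s | s_{t-1} = r)
    B   : emission matrix, B[s][o] = P(o_t = o | s_t = s)
    """
    n_states, T = len(pi), len(obs)

    # Forward pass (alpha), normalized at each step for numerical stability.
    alpha = [[pi[s] * B[s][obs[0]] for s in range(n_states)]]
    scale = [sum(alpha[0])]
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        row = [B[s][obs[t]] * sum(alpha[t - 1][r] * A[r][s]
                                  for r in range(n_states))
               for s in range(n_states)]
        z = sum(row)
        scale.append(z)
        alpha.append([a / z for a in row])

    # Backward pass (beta), reusing the forward scaling factors.
    beta = [[1.0] * n_states for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for s in range(n_states):
            beta[t][s] = sum(A[s][r] * B[r][obs[t + 1]] * beta[t + 1][r]
                             for r in range(n_states)) / scale[t + 1]

    # gamma[t][s] is proportional to alpha[t][s] * beta[t][s].
    gamma = []
    for t in range(T):
        w = [alpha[t][s] * beta[t][s] for s in range(n_states)]
        z = sum(w)
        gamma.append([x / z for x in w])
    return gamma

def fixed_length_vector(obs, pi, A, B):
    """Average posterior occupancy per state: the sequence's description
    vector has dimension n_states no matter how long the sequence is."""
    gamma = forward_backward(obs, pi, A, B)
    n_states, T = len(pi), len(gamma)
    return [sum(g[s] for g in gamma) / T for s in range(n_states)]
```

Because the vector has one entry per hidden state, sequences of any length land in the same feature space, where an ordinary classifier (and Euclidean distance) can be applied directly.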