Zipping Out Relevant Information
Computing in Science and Engineering
Towards parameter-free data mining
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Compression-based data mining of sequential data
Data Mining and Knowledge Discovery
Artificial Intelligence Review
Catching the Drift: Using Feature-Free Case-Based Reasoning for Spam Filtering
ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development
Sublinear Algorithms for Approximating String Compressibility
APPROX '07/RANDOM '07 Proceedings of the 10th International Workshop on Approximation and the 11th International Workshop on Randomization, and Combinatorial Optimization. Algorithms and Techniques
ACM Transactions on Information and System Security (TISSEC)
A Compression-Based Method for Stemmatic Analysis
Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
On compression-based text classification
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Hi-index | 0.00 |
Inductive learning methods, such as neural networks and decision trees, have become a popular approach to developing DNA sequence identification tools. Such methods attempt to form models of a collection of training data that can be used to predict future data accurately. The common approach to using such methods on DNA sequence identification problems forms models that depend on the {\em absolute locations} of nucleotides and assume {\em independence} of consecutive nucleotide locations. This paper describes a new class of learning methods, called {\em compression-based induction} (CBI), that is geared towards sequence learning problems such as those that arise when learning DNA sequences. The central idea is to use text compression techniques on DNA sequences as the means for generalizing