The nature of statistical learning theory
The nature of statistical learning theory
Data mining: practical machine learning tools and techniques with Java implementations
Data mining: practical machine learning tools and techniques with Java implementations
An Evaluation of Statistical Approaches to Text Categorization
Information Retrieval
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
The compleat patricia.
Text classification using string kernels
The Journal of Machine Learning Research
Proceedings of the 4th ACM workshop on Security and artificial intelligence
Hi-index | 0.00 |
The ability to automatically classify files based on their low-level, short-range structures is of particular importance in computer forensics. We report a study on the automatic learning of file classification using byte sub-stream kernels that capture these low-level structures. We automatically discover byte-level patterns in a file by extracting a byte sequence feature map and use a suffix trie data structure to efficiently store and manipulate the feature map. Using the feature map we compute the spectrum kernel and, together with a support vector machine classifier algorithm, we are able to efficiently categorize a variety of different system and application file types. Experiments have provided good file classification performance results.