Efficient and expressive comparison of sequences is an essential procedure for learning with sequential data. In this article we propose a generic framework for computing similarity measures for sequences, covering various kernel, distance and non-metric similarity functions. The basis for comparison is the embedding of sequences using a formal language, such as a set of natural words, k-grams or all contiguous subsequences. As realizations of the framework we provide linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures. Experiments on data sets from bioinformatics, text processing and computer security illustrate the efficiency of the proposed algorithms, which reach peak performances of up to 10^6 pairwise comparisons per second. The utility of distances and non-metric similarity measures for sequences as alternatives to string kernels is demonstrated in applications of text categorization, network intrusion detection and transcription site recognition in DNA.
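The sorted-array idea mentioned above can be illustrated with a minimal Python sketch. It is not the article's implementation: the function names (`kgram_counts`, `aligned_counts`, `linear_kernel`, `manhattan_distance`) are hypothetical, the embedding language is assumed to be k-grams, and Python's comparison sort is used for brevity, so only the merge step runs in time linear in the number of distinct k-grams. Each sequence is embedded as a sorted array of (k-gram, count) pairs, and a single merge pass over two such arrays yields either a kernel or a distance.

```python
from collections import Counter


def kgram_counts(seq, k):
    """Embed a sequence as a sorted list of (k-gram, count) pairs."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return sorted(counts.items())


def aligned_counts(a, b):
    """Merge two sorted (k-gram, count) arrays.

    Yields one (count_in_a, count_in_b) pair per k-gram that occurs in
    either array, using 0 where a k-gram is absent from one of them.
    """
    i = j = 0
    while i < len(a) and j < len(b):
        (wa, ca), (wb, cb) = a[i], b[j]
        if wa == wb:
            yield ca, cb
            i += 1
            j += 1
        elif wa < wb:
            yield ca, 0
            i += 1
        else:
            yield 0, cb
            j += 1
    # Whatever remains occurs in only one of the two sequences.
    for _, ca in a[i:]:
        yield ca, 0
    for _, cb in b[j:]:
        yield 0, cb


def linear_kernel(a, b):
    """Inner product of the two k-gram count vectors."""
    return sum(ca * cb for ca, cb in aligned_counts(a, b))


def manhattan_distance(a, b):
    """L1 distance between the two k-gram count vectors."""
    return sum(abs(ca - cb) for ca, cb in aligned_counts(a, b))


x = kgram_counts("abab", 2)  # [('ab', 2), ('ba', 1)]
y = kgram_counts("abba", 2)  # [('ab', 1), ('ba', 1), ('bb', 1)]
print(linear_kernel(x, y))       # 3
print(manhattan_distance(x, y))  # 2
```

Keeping the merge separate from the per-k-gram arithmetic means the same pass serves kernels, distances and non-metric similarity coefficients alike; only the function applied to each pair of counts changes.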