We propose hashing as a way to compute kernels efficiently. This generalizes previous work based on sampling, and we show a principled way to compute the kernel matrix for data streams and sparse feature spaces. Moreover, we give bounds on the deviation from the exact kernel matrix. The approach has applications to estimation on strings and graphs.
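
The idea of hashing a sparse feature space into a fixed number of buckets and taking inner products there can be illustrated with a minimal Python sketch. It is not the construction from the paper: the MD5-based bucket function, the character-trigram features, the unsigned bucket sums, and all function names (`bucket`, `hashed_features`, `hash_kernel`, `exact_kernel`, `ngrams`) are assumptions made for illustration.

```python
import hashlib
from collections import Counter


def bucket(feature: str, num_buckets: int) -> int:
    """Map a feature name to a bucket index (assumed hash: first 8 bytes of MD5)."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hashed_features(counts, num_buckets: int):
    """Sum the values of all features that land in the same bucket."""
    vec = [0.0] * num_buckets
    for feature, value in counts.items():
        vec[bucket(feature, num_buckets)] += value
    return vec


def hash_kernel(x, y, num_buckets: int) -> float:
    """Inner product of the two hashed feature vectors."""
    hx = hashed_features(x, num_buckets)
    hy = hashed_features(y, num_buckets)
    return sum(a * b for a, b in zip(hx, hy))


def exact_kernel(x, y) -> float:
    """Inner product in the original sparse feature space, for comparison."""
    return sum(value * y.get(feature, 0.0) for feature, value in x.items())


def ngrams(text: str, n: int = 3) -> Counter:
    """Character n-gram counts, used here as an example of sparse string features."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    x = ngrams("hashing kernels")
    y = ngrams("hash kernels")
    print("exact :", exact_kernel(x, y))
    for m in (8, 64, 2 ** 16):
        print(f"hashed (m={m}):", hash_kernel(x, y, m))
```

With non-negative counts and unsigned bucket sums, collisions can only add to the inner product, so the hashed value in this sketch never falls below the exact one, and the gap shrinks as the number of buckets grows. Because each input is processed one feature at a time into a fixed-size vector, the same code applies to streamed data without storing the full feature space.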