Analysis of biological data often involves large data sets and computationally expensive algorithms. Databases of biological data continue to grow, leading to an increasing demand for improved algorithms and data structures. Despite having many advantages over more traditional indexing structures, the Bloom filter is almost unused in bioinformatics. Here we present a robust and efficient Bloom filter implementation in Haskell, and implement a simple bioinformatics application for indexing and matching sequence data. We use this to index the chromosomes that make up the human genome, and map all available gene sequences to it. Our experiences with developing and tuning our application suggest that for bioinformatics applications, Haskell offers a compelling combination of rapid development, quality assurance, and high performance.
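
To make the idea of indexing and matching sequence data with a Bloom filter concrete, here is a minimal sketch in Haskell. It is not the implementation described above: the names (`Bloom`, `emptyBloom`, `insert`, `member`, `kmers`, `indexSequence`) are hypothetical, the bit vector is a plain `Integer` rather than a tuned packed array, the hash is a simple FNV-1a-style fold, and splitting sequences into fixed-length words (k-mers) is an assumption made only for illustration.

```haskell
import Data.Bits (setBit, testBit, xor)
import Data.Char (ord)
import Data.List (foldl', tails)

-- Illustrative Bloom filter: number of bits, number of hash functions,
-- and the bits themselves (an Integer used as an immutable bit vector).
data Bloom = Bloom
  { bloomSize   :: Int
  , bloomHashes :: Int
  , bloomBits   :: Integer
  }

emptyBloom :: Int -> Int -> Bloom
emptyBloom m k = Bloom m k 0

-- FNV-1a-style string hash, kept below 2^32 so it stays non-negative.
fnv1a :: Int -> String -> Int
fnv1a = foldl' step
  where
    step h c = ((h `xor` ord c) * 16777619) `mod` 4294967296

-- Derive all bit positions from two base hashes (double hashing):
-- position i is (h1 + i * h2) mod m.
positions :: Bloom -> String -> [Int]
positions (Bloom m k _) s = [ (h1 + i * h2) `mod` m | i <- [0 .. k - 1] ]
  where
    h1 = fnv1a 2166136261 s
    h2 = fnv1a 40503 s

-- Set every bit position belonging to the word.
insert :: String -> Bloom -> Bloom
insert s b = b { bloomBits = foldl' setBit (bloomBits b) (positions b s) }

-- True if every position is set. False positives are possible,
-- false negatives are not.
member :: String -> Bloom -> Bool
member s b = all (testBit (bloomBits b)) (positions b s)

-- All overlapping words of length k in a sequence.
kmers :: Int -> String -> [String]
kmers k = takeWhile ((== k) . length) . map (take k) . tails

-- Index a sequence by inserting each of its k-mers.
indexSequence :: Int -> String -> Bloom -> Bloom
indexSequence k s b = foldl' (flip insert) b (kmers k s)

main :: IO ()
main = do
  let genome = "ACGTACGTTAGCCGATAGGCTA"
      bloom  = indexSequence 8 genome (emptyBloom 1024 3)
  -- The query is a substring of the indexed sequence, so every one of
  -- its 8-mers was inserted and the lookup must succeed.
  print (all (`member` bloom) (kmers 8 "CGTTAGCCGATA"))  -- prints True
```

A query whose words were never inserted will usually be rejected, but it can occasionally be accepted as a false positive. The filter size and the number of hash functions control how often that happens, which is the trade-off that makes the structure so compact.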