N-gram counting, where an n-gram is an n-character sequence in a text document, is a well-established technique for classifying the language of a document. In this paper, n-gram processing is accelerated using reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels: parallel Bloom filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (the HAIL algorithm) that uses off-chip SRAM for lookups, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85x the speed of comparable software and 1.45x the speed of the competing hardware design.
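To make the idea concrete, the following is a minimal software sketch of n-gram language classification with one Bloom filter per language. It is not the hardware design described above: the names `BloomFilter`, `build_profile`, and `classify`, the filter size, the number of hash functions, the choice of 4-grams, and the use of salted SHA-256 as a stand-in for hardware hash functions are all assumptions made for illustration. In this sketch the per-language lookups run in a sequential loop, whereas a hardware design could perform them in parallel.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """Toy Bloom filter (hypothetical parameters, not taken from the paper)."""

    def __init__(self, num_bits=4096, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Assumption: salted SHA-256 stands in for independent hash functions.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos] for pos in self._positions(item))


def ngrams(text, n=4):
    """Return all overlapping n-character sequences in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profile(training_text, n=4, top_k=500):
    """Store the top_k most frequent n-grams of a language in a Bloom filter."""
    bloom = BloomFilter()
    for gram, _count in Counter(ngrams(training_text, n)).most_common(top_k):
        bloom.add(gram)
    return bloom


def classify(document, profiles, n=4):
    """Count n-gram hits per language and return (best_language, hit_counts)."""
    hits = {lang: 0 for lang in profiles}
    for gram in ngrams(document, n):
        for lang, bloom in profiles.items():  # sequential here
            if gram in bloom:
                hits[lang] += 1
    return max(hits, key=hits.get), hits


if __name__ == "__main__":
    profiles = {
        "en": build_profile("the quick brown fox jumps over the lazy dog"),
        "es": build_profile("el veloz zorro marron salta sobre el perro perezoso"),
    }
    print(classify("the lazy dog jumps", profiles))
```

The document is assigned to the language whose filter matches the most n-grams. Because a Bloom filter can report false positives, a hit count may slightly overstate the true number of matches; a larger filter or more hash functions reduces that effect at the cost of memory and lookups.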