With the rise in the amount of information being streamed across networks, there is a growing demand to vet its quality, type and content for purposes such as spam detection, security and search. In this paper, we develop an energy-efficient, high-performance information filtering system capable of classifying a stream of incoming documents at high speed. The prototype parses the document stream on a multicore CPU and then performs classification on Field-Programmable Gate Arrays (FPGAs). We implemented a Naive Bayes classifier on the prototype and, on a large TREC data collection, compared it to an optimized CPU-based baseline. Our empirical findings show that we can classify documents at 10 Gb/s, which is up to 94 times faster than the CPU baseline (and up to 5 times faster than previous FPGA-based implementations). In future work, we aim to increase the throughput by another order of magnitude by implementing both the parser and the filter on the FPGA.
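The parse-then-classify pipeline described in the abstract can be illustrated in software. The following is a minimal sketch only: it assumes a multinomial Naive Bayes model with add-one smoothing and a whitespace tokenizer, neither of which is specified in the abstract, and it says nothing about how the FPGA design is organized. The names `parse`, `train`, `classify` and `filter_stream` are hypothetical.

```python
import math
from collections import Counter


def parse(text):
    """Stand-in for the CPU parsing stage: lowercase, whitespace tokenizer (assumption)."""
    return text.lower().split()


def train(labelled_docs, alpha=1.0):
    """Fit a multinomial Naive Bayes model from (text, label) pairs.

    alpha is the additive smoothing constant (an assumption, not from the paper).
    """
    class_docs = Counter()   # number of training documents per class
    term_counts = {}         # per-class term frequencies
    vocab = set()
    for text, label in labelled_docs:
        tokens = parse(text)
        class_docs[label] += 1
        term_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)

    total_docs = sum(class_docs.values())
    classes = {}
    for label, n_docs in class_docs.items():
        denom = sum(term_counts[label].values()) + alpha * len(vocab)
        classes[label] = {
            "log_prior": math.log(n_docs / total_docs),
            # log-probability of a vocabulary term never seen in this class
            "log_unseen": math.log(alpha / denom),
            "log_terms": {
                term: math.log((count + alpha) / denom)
                for term, count in term_counts[label].items()
            },
        }
    return {"vocab": vocab, "classes": classes}


def classify(model, text):
    """Return the class with the highest log-posterior score for one document."""
    # Terms outside the training vocabulary are ignored.
    tokens = [t for t in parse(text) if t in model["vocab"]]
    best_label, best_score = None, -math.inf
    for label, params in model["classes"].items():
        score = params["log_prior"] + sum(
            params["log_terms"].get(t, params["log_unseen"]) for t in tokens
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_stream(model, stream):
    """Lazily classify each document of an incoming stream."""
    for text in stream:
        yield text, classify(model, text)
```

Working in log space turns the per-term probability product into a sum of table lookups, so each document needs only one lookup and one addition per term and class.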