The authors introduce special heuristics to the Knuth-Morris-Pratt algorithm to reduce the time and space required to perform string matching. They compare their hardware-based approach to the software approaches embodied in the Unix grep and fgrep commands. Simulation results show that the hardware approach can provide a 25- to 500-fold performance improvement, depending on the complexity of the query, and that it is fast enough, even in the presence of variable-length 'don't cares', to keep up with a disk delivering 20 million characters per second. The approach compares favorably to other hardware designs in speed and space. The proposed hardware implementation requires 10 kB of one-cycle static memory, 28 single-character comparators, four 16-bit adders, and control logic for four finite-state machines with a term-matcher controller. Beyond this configuration, additional hardware produces negligible performance improvements for queries with up to 80 terms, about half of which have variable-length 'don't cares'.
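For readers unfamiliar with the baseline, the following is a minimal software sketch of the standard Knuth-Morris-Pratt algorithm that the abstract takes as its starting point. It does not reproduce the authors' heuristics, their handling of variable-length 'don't cares', or their hardware design; the function names `build_failure_table` and `kmp_search` are illustrative assumptions, not taken from the paper.

```python
def build_failure_table(pattern):
    """For each position i, store the length of the longest proper prefix
    of pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text.

    Each text character is examined once and the scan never backs up
    in the text, which is what makes the method attractive for
    streaming data straight off a disk.
    """
    if not pattern:
        return []
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            # Continue searching, allowing overlapping occurrences.
            k = failure[k - 1]
    return matches


if __name__ == "__main__":
    print(kmp_search("abababca", "aba"))
```

The failure table is the only state that depends on the pattern, so it can be computed once per query term and reused for the whole scan. This is plain textbook KMP, shown only as context for the time and space costs the authors set out to reduce.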