Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF-8 to UTF-16 transcoding, XML parsing, string search, and regular expression matching. Direct architectural support for these algorithms in future SWAR instruction sets could further increase performance and simplify the programming task. A set of simple SWAR instruction set extensions is proposed for this purpose, based on the principle of systematic support for inductive doubling as an algorithmic technique. These extensions are shown to significantly reduce instruction count in core parallel bit stream algorithms, often providing a 3X or better improvement. The extensions are also shown to be useful for SWAR programming in other application areas, where they provide a systematic treatment of horizontal operations. An implementation model for these extensions involves relatively simple circuitry added to the operand fetch components in a pipelined processor.
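As an illustration of inductive doubling on a conventional 64-bit register, the following minimal sketch computes a population count by repeatedly doubling the field width: 1-bit counts are summed into 2-bit fields, 2-bit fields into 4-bit fields, and so on up to 64 bits. This is the well-known SWAR bit-counting idiom, not code taken from the paper; the function names `popcount_inductive_doubling`, `popcount_naive`, and `check_popcount` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Population count by inductive doubling (illustrative sketch).
 * Each step adds adjacent fields of width w to form fields of width 2w.
 * The mask and shift around every addition are overhead imposed by a
 * conventional instruction set. */
uint64_t popcount_inductive_doubling(uint64_t x)
{
    x = (x & 0x5555555555555555ULL) + ((x >> 1)  & 0x5555555555555555ULL); /* 1-bit  -> 2-bit fields  */
    x = (x & 0x3333333333333333ULL) + ((x >> 2)  & 0x3333333333333333ULL); /* 2-bit  -> 4-bit fields  */
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4)  & 0x0F0F0F0F0F0F0F0FULL); /* 4-bit  -> 8-bit fields  */
    x = (x & 0x00FF00FF00FF00FFULL) + ((x >> 8)  & 0x00FF00FF00FF00FFULL); /* 8-bit  -> 16-bit fields */
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL); /* 16-bit -> 32-bit fields */
    x = (x & 0x00000000FFFFFFFFULL) + ((x >> 32) & 0x00000000FFFFFFFFULL); /* 32-bit -> 64-bit field  */
    return x;
}

/* Bit-at-a-time reference implementation, used only for cross-checking. */
uint64_t popcount_naive(uint64_t x)
{
    uint64_t count = 0;
    while (x != 0) {
        count += x & 1u;   /* add the lowest bit */
        x >>= 1;           /* move to the next bit */
    }
    return count;
}

/* Asserts that both implementations agree on a given input. */
void check_popcount(uint64_t x)
{
    assert(popcount_inductive_doubling(x) == popcount_naive(x));
}
```

Each of the six steps costs two ANDs, one shift, and one add. Masking and shifting overhead of this kind, repeated at every doubling step, is plausibly what direct support for inductive doubling would remove, although the exact instruction forms proposed are not given in the abstract. The final step is also a simple case of a horizontal operation, since it combines the two halves of a single register.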