Accelerating search and recognition workloads with SSE 4.2 string and text processing instructions

  • Authors:
  • Guangyu Shi; Min Li; Mikko Lipasti

  • Affiliations:
  • University of Wisconsin-Madison; University of Wisconsin-Madison; University of Wisconsin-Madison

  • Venue:
  • ISPASS '11 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software
  • Year:
  • 2011

Abstract

The volume of information is increasing rapidly, doubling every three years. Consequently, the search and recognition stages in computer applications will consume a growing portion of total CPU time. The SSE 4.2 instruction set, first implemented in Intel's Core i7, provides string and text processing instructions (STTNI) that use SIMD operations to process character data. Though originally conceived for accelerating string, text, and XML processing, the capabilities of these instructions are useful outside those domains, and it is worth revisiting the search and recognition stages of numerous applications to see where STTNI can improve performance. In this paper, we explore the feasibility and potential benefit of using STTNI to improve the CPU and memory performance of search-and-recognition applications. We optimized four benchmark applications with STTNI: cache simulation, a B+tree search algorithm, template matching, and the Basic Local Alignment Search Tool (BLAST). The optimized versions outperform their respective original implementations by a factor of 1.4× to 13×.