Multi-core web servers and edge routers are increasingly used to deploy computationally intensive Deep Packet Inspection (DPI) programs. Previous work has shown that connection-locality-based scheduling improves L7-filter performance on multi-core servers. However, we show that highly threaded hierarchical multi-core processors, such as the Sun Niagara 2 processor, accumulate workload imbalance at each resource layer, which can offset the benefits of connection locality. In addition, connection-locality-based load balancing fails when network traffic is unevenly distributed. In this paper, we propose an adaptive hash-based multilayer scheduler for a highly threaded hierarchical Sun Niagara 2 server. Our scheduler maintains connection locality and adaptively adjusts the scheduling to balance the real-time workload. The original Highest Random Weight (HRW) hash guarantees connection locality but balances the workload only in terms of the number of distinct connections. We extend the original single-layer HRW into a hierarchical "hash tree" scheduler that balances the connection workload in accordance with the hierarchical processor architecture. We then optimize our multilayer scheduler to adaptively adjust scheduling decisions based on the service time at each level, further improving the system load balance. Our scheduler is shown to increase system throughput by 59.2% compared to the previously proposed connection locality optimization.
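To make the "hash tree" idea concrete, the following is a minimal sketch of single-layer HRW and a layered variant that applies HRW once per level of a processor hierarchy. It is an illustration under stated assumptions, not the scheduler described in the abstract: the function names, the SHA-256-based weight, the toy two-core topology, and the connection-ID string format are all hypothetical, and the adaptive adjustment based on service time is deliberately omitted.

```python
import hashlib


def hrw_weight(conn_id: str, node: str) -> int:
    """Pseudo-random weight for a (connection, node) pair.

    Assumption: SHA-256 stands in for whatever hash a real scheduler uses.
    """
    digest = hashlib.sha256(f"{conn_id}|{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def hrw_pick(conn_id: str, nodes: list[str]) -> str:
    """Single-layer HRW: the node with the highest weight wins.

    The same connection always maps to the same node (connection locality).
    """
    return max(nodes, key=lambda node: hrw_weight(conn_id, node))


def hash_tree_pick(conn_id: str, tree) -> list[str]:
    """Apply HRW once per level of a nested topology.

    `tree` is a dict of dicts whose innermost values are lists of leaf names.
    Returns the chosen path from the top level down to the leaf.
    """
    path: list[str] = []
    level = tree
    while True:
        # Prefix candidates with the path so far, so that identically named
        # units under different parents get independent weights.
        prefix = "/".join(path)
        candidates = sorted(level)  # dict keys at inner levels, items at the leaf
        choice = max(
            candidates,
            key=lambda name: hrw_weight(conn_id, f"{prefix}/{name}"),
        )
        path.append(choice)
        if isinstance(level, list):
            return path
        level = level[choice]


# Hypothetical toy topology: core -> pipeline -> hardware thread.
TOPOLOGY = {
    "core0": {"pipe0": ["t0", "t1"], "pipe1": ["t2", "t3"]},
    "core1": {"pipe0": ["t0", "t1"], "pipe1": ["t2", "t3"]},
}

# Hypothetical connection identifier built from the 5-tuple.
conn = "10.0.0.1:1234->10.0.0.2:80/tcp"
print(hash_tree_pick(conn, TOPOLOGY))
```

Because each level is chosen purely from the connection ID, every packet of a connection lands on the same core, pipeline, and thread. An adaptive variant would additionally bias the per-level choice using measured service time, which this sketch does not attempt.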