Compiler support for software-based cache partitioning
LCTES '95 Proceedings of the ACM SIGPLAN 1995 workshop on Languages, compilers, & tools for real-time systems
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Evaluation of multithreaded uniprocessors for commercial application environments
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
ICS '90 Proceedings of the 4th international conference on Supercomputing
Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment
Journal of the ACM (JACM)
Thread-level parallelism and interactive performance of desktop applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Real-Time Systems
Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications
Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications
Deadline Scheduling for Real-Time Systems: Edf and Related Algorithms
Deadline Scheduling for Real-Time Systems: Edf and Related Algorithms
A survey of processors with explicit multithreading
ACM Computing Surveys (CSUR)
Real-time scheduling on multithreaded processors
RTCSA '00 Proceedings of the Seventh International Conference on Real-Time Systems and Applications
Integrating the timing analysis of pipelining and instruction caching
RTSS '95 Proceedings of the 16th IEEE Real-Time Systems Symposium
Techniques for Software Thread Integration in Real-Time Embedded Systems
RTSS '98 Proceedings of the IEEE Real-Time Systems Symposium
Soft Real- Time Scheduling on Simultaneous Multithreaded Processors
RTSS '02 Proceedings of the 23rd IEEE Real-Time Systems Symposium
Virtual simple architecture (VISA): exceeding the complexity limit in safe real-time systems
Proceedings of the 30th annual international symposium on Computer architecture
Virtual multiprocessor: an analyzable, high-performance architecture for real-time computing
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
A case study of multi-threading in the embedded space
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Hardware support for WCET analysis of hard real-time multicore systems
Proceedings of the 36th annual international symposium on Computer architecture
Proceedings of the 2010 ACM Symposium on Applied Computing
MIPS MT: a multithreaded RISC architecture for embedded real-time processing
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Timing effects of DDR memory systems in hard real-time multicore architectures: Issues and solutions
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
Hi-index | 0.00 |
A coarse-grain multithreaded processor can effectively hide long memory latencies by quickly switching to an alternate task when the active task issues a memory request, improving overall throughput. However, dynamic switching cannot be safely exploited to improve throughput in hard-real-time embedded systems. The schedulability of a task-set (guaranteeing all tasks meet deadlines) must be determined a priori using offline schedulability tests. Any computation/memory overlap must be statically accounted for. We develop a novel analytical framework that bounds the overlap between computation of a pipeline-resident-task and on-going memory transfers of other tasks. A simple closed-form schedulability test is derived, that only depends on the aggregate computation (C) and memory (M) components of tasks. Namely, the technique does not require specificity regarding the location of memory transfers within and among tasks and avoids searching all task permutations for a specific feasible schedule. To the best of our knowledge, this is the first work to provide the necessary formalism for safely and tractably exploiting coarse-grain multithreaded processors to tolerate memory latency in hard-real-time systems, exceeding the schedulability limits of classic real-time theory for uniprocessors. Our techniques make it possible to capitalize on higher frequency embedded processors, despite the widening processor-memory speed gap. Experiments with task-sets from C-lab benchmarks reveal improvement in the schedulability of task-sets, measured as the ability to schedule previously infeasible task-sets or reduce utilization for already feasible task-sets. We also demonstrate proof-of-concept by deploying our method in a cycle-level simulator of an ARM11-like embedded microprocessor augmented with multiple register contexts, the same hardware multithreading support available in Ubicom's IP3023 embedded microprocessor.