Real-time scheduling of mixture-of-experts systems with limited resources

  • Authors:
  • Prapaporn Rattanatamrong; Jose A.B. Fortes

  • Affiliations:
  • University of Florida, Gainesville, FL, USA; University of Florida, Gainesville, FL, USA

  • Venue:
  • Proceedings of the 13th ACM international conference on Hybrid systems: computation and control
  • Year:
  • 2010

Abstract

Mixture-of-Experts (MoE) systems solve intricate problems by combining results generated independently by multiple computational models (the "experts"). Given an instance of a problem, the responsibility of an expert measures the degree to which the expert's output contributes to the final solution. Brain-Machine Interfaces are examples of applications where an MoE system needs to run periodically and expert responsibilities can vary across execution cycles. When resources are insufficient to run all experts in every cycle, it becomes necessary to execute the most responsible experts within each cycle. The problem of adaptively scheduling experts with dynamic responsibilities can be formulated as a succession of optimization problems. Each of these problems can be solved by a known technique called "task compression", using explicit mappings described in this paper to relate expert responsibilities to task elasticities. A novel heuristic is proposed to enable real-time execution rate adaptation in MoE systems with insufficient resources. In any given cycle, the heuristic uses sensitivity analysis to test whether one of two pre-computed schedules is the optimal solution of the optimization problem, which avoids re-optimization when the test result is positive. These two candidate schedules are the schedule used in the previous cycle and the schedule pre-computed by the heuristic during the previous cycle, using future responsibilities predicted by the heuristic's responsibility predictor. When re-execution of the task-compression algorithm is not needed, our heuristic reduces the scheduling delay in the execution of experts from O(N^2) time to O(N) time, where N denotes the number of experts. Experimental evaluation of the heuristic on a test case in motor control shows that these time savings occur and that scheduled experts' deadlines are met in up to 90% of all cycles. For the test scenario considered in the paper, the average output error of a real-time MoE system due to the use of limited resources is less than 7%.
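
To make the task-compression step more concrete, the following Python sketch shows a generic elastic-task compression loop of the kind known from the elastic scheduling literature: each task's utilization is reduced in proportion to its elasticity until the total fits the available capacity, and tasks that reach their minimum utilization are frozen there. This is only an illustration under stated assumptions, not the paper's implementation. In particular, the function names, the parameters, and the mapping `elasticity_from_responsibility` (more responsible experts are treated as less elastic) are hypothetical; the abstract does not give the actual mappings used in the paper.

```python
from typing import List


def elasticity_from_responsibility(responsibility: float, eps: float = 1e-3) -> float:
    """HYPOTHETICAL mapping (not taken from the paper): an expert with a
    higher responsibility gets a lower elasticity, so it is compressed less."""
    return (1.0 - responsibility) + eps


def compress(nominal: List[float], minimum: List[float],
             elasticity: List[float], capacity: float) -> List[float]:
    """Return per-task utilizations whose sum does not exceed `capacity`.

    Assumes minimum[i] <= nominal[i] and strictly positive elasticities.
    """
    n = len(nominal)
    if any(e <= 0 for e in elasticity):
        raise ValueError("this sketch assumes strictly positive elasticities")
    if sum(minimum) > capacity:
        raise ValueError("infeasible: minimum utilizations exceed capacity")

    util = list(nominal)
    if sum(util) <= capacity:
        return util  # everything fits at nominal rates; nothing to compress

    fixed = [False] * n  # tasks frozen at their minimum utilization
    while True:
        variable = [i for i in range(n) if not fixed[i]]
        if not variable:
            break
        u_fixed = sum(util[i] for i in range(n) if fixed[i])
        # Utilization that the still-variable tasks must give up in total.
        excess = sum(nominal[i] for i in variable) - (capacity - u_fixed)
        e_sum = sum(elasticity[i] for i in variable)

        clamped = False
        for i in variable:
            # Each variable task gives up a share proportional to its elasticity.
            util[i] = nominal[i] - excess * elasticity[i] / e_sum
            if util[i] < minimum[i]:
                util[i] = minimum[i]
                fixed[i] = True
                clamped = True
        if not clamped:
            break  # no task fell below its minimum, so the result is final
    return util
```

Each pass over the tasks takes O(N) time, and in the worst case one task is frozen per pass, so the loop runs in O(N^2) time overall. This is consistent with the re-optimization cost quoted in the abstract, which the proposed heuristic avoids whenever one of its two pre-computed schedules passes the sensitivity test.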