As information technology (IT) administration becomes increasingly complex, workflow technologies are gaining popularity for IT automation. Writing correct workflow programs is notoriously difficult. Although static analysis tools are available, fixing defects remains manual and error-prone. This paper applies discrete control theory to IT automation workflows. Discrete control detects flaws in workflows just as static analysis does, and more importantly it also allows safe execution of flawed workflows by dynamically avoiding run-time failures. Our approach can guarantee compliance with certain requirements and can partially decouple requirements from software, reducing the need to modify the latter if the former change. We have implemented a discrete control module for a real IT automation system. Experiments with workflows from a real production system and with randomly generated workflows show that our approach scales to workflows of practical size.