Supervisory processes are fundamental to data center operations that strive for fault resilience: any downtime can directly affect the business's income and certainly its reputation. Current monitoring tools rely on experts to configure constant thresholds on single streams, an approach that is not appropriate for dynamic systems and is insufficient to capture complex patterns. We present HOLMES, a system built to help data center experts anticipate failures by combining Event Driven Architecture, Complex Event Processing (CEP) and an unsupervised machine learning algorithm. Based on rules created by the users, the system continuously checks for known problems. For unknown problems, we leverage the CEP engine to aggregate and join streams of real-time data, feeding normalized input to FRAHST, our machine learning algorithm that detects anomalous patterns across multivariate numerical streams. We describe how the UI module also operates within the publish/subscribe paradigm to enhance situational awareness. The system was well received and was successfully implemented at one of the largest Internet Service Providers in South America.
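
As a rough illustration of the idea of joining several streams and feeding normalized values to a detector, the following Python sketch may help. It is not FRAHST and not the HOLMES CEP engine: the class `RollingNormalizer`, the function `join_and_score`, the window size, the stream names and the 3-sigma limit are all hypothetical choices made for this example, and the scoring rule (a sum of squared z-scores) is a simple stand-in for the actual algorithm.

```python
from collections import deque
from statistics import mean, pstdev


class RollingNormalizer:
    """Normalizes one numeric stream against a sliding window (hypothetical helper)."""

    def __init__(self, window=30):
        self.values = deque(maxlen=window)

    def update(self, x):
        # Score x against the window before adding it, so an outlier
        # does not shift the statistics it is compared with.
        if len(self.values) < 2:
            z = 0.0  # not enough history yet
        else:
            mu = mean(self.values)
            sigma = max(pstdev(self.values), 1e-9)  # avoid division by zero
            z = (x - mu) / sigma
        self.values.append(x)
        return z


def join_and_score(latest, normalizers, limit=3.0):
    """Join the latest value of each stream into one normalized vector.

    `latest` maps a stream name to its newest value. The vector is flagged
    when its sum of squared z-scores exceeds limit**2 (an assumed rule,
    standing in for a real multivariate detector).
    """
    vector = {name: normalizers[name].update(value) for name, value in latest.items()}
    score = sum(z * z for z in vector.values())
    return vector, score > limit ** 2
```

A constant threshold such as "alert when CPU exceeds 95" would stay silent on a reading of 90, whereas the sketch above flags it when recent history has hovered around 50, because each stream is judged against its own recent behavior and the streams are scored together.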