Automated System Monitoring and Notification With Swatch

Authors:
Stephen E. Hansen; E. Todd Atkins
Affiliations:
Stanford University; Stanford University
Venue:
LISA '93 Proceedings of the 7th USENIX conference on System administration
Year:
1993

References (1):
Programming perl

Cited by (23):
Platform independent tool for local event correlation. Acta Cybernetica
Getting More Work Out Of Work Tracking Systems. LISA '94 Proceedings of the 8th USENIX conference on System administration
Using Visualization in System and Network Administration. LISA '96 Proceedings of the 10th USENIX conference on System administration
Extensible, Scalable Monitoring for Clusters of Computers. LISA '97 Proceedings of the 11th USENIX conference on System administration
Automation of Site Configuration Management. LISA '97 Proceedings of the 11th USENIX conference on System administration
An NFS Configuration Management System and its Underlying Object-Oriented Model. LISA '98 Proceedings of the 12th USENIX conference on System administration
Computer Immunology. LISA '98 Proceedings of the 12th USENIX conference on System administration
Dealing with Public Ethernet Jacks - Switches, Gateways, and Authentication. LISA '99 Proceedings of the 13th USENIX conference on System administration
Gossips: System and Service Monitor. LISA '01 Proceedings of the 15th USENIX conference on System administration
MieLog: A Highly Interactive Visual Log Browser Using Information Visualization and Statistical Analysis. LISA '02 Proceedings of the 16th USENIX conference on System administration
Process Monitor: Detecting Events That Didn't Happen. LISA '02 Proceedings of the 16th USENIX conference on System administration
Refereed Papers: Real-time Log File Analysis Using the Simple Event Correlator (SEC). LISA '04 Proceedings of the 18th USENIX conference on System administration
Dynamic syslog mining for network failure monitoring. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Auto-diagnosis of field problems in an appliance operating system. ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
Analyzing system logs: a new view of what's important. SYSML'07 Proceedings of the 2nd USENIX workshop on Tackling computer systems problems with machine learning techniques
An automated approach for abstracting execution logs to execution events. Journal of Software Maintenance and Evolution: Research and Practice - Special Issue on Program Comprehension through Dynamic Analysis (PCODA)
Detecting large-scale system problems by mining console logs. Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Efficiently extracting operational profiles from execution logs using suffix arrays. ISSRE'09 Proceedings of the 20th IEEE international conference on software reliability engineering
Analysis of execution log files. Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2
Mining invariants from console logs for system problem detection. USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
Log analysis and event correlation using variable temporal event correlator (VTEC). LISA'10 Proceedings of the 24th international conference on Large installation system administration
Experience mining Google's production console logs. SLAML'10 Proceedings of the 2010 workshop on Managing systems via log analysis and machine learning techniques
Provenance for system troubleshooting. LISA'11 Proceedings of the 25th international conference on Large Installation System Administration

Abstract

This paper describes an approach to monitoring events on a large number of servers and workstations. While modern UNIX systems are capable of logging a variety of information concerning the health and status of their hardware and operating system software, they are generally not configured to do so. Even when this information is logged, it is often hidden in places that are either not monitored regularly or susceptible to deletion or modification by a successful intruder. In addition, a system administrator must often monitor several, perhaps dozens of, systems. To address these problems, our approach begins with the modification of certain system programs to enhance their logging capabilities. It also calls for the logging facilities on each of these systems to be configured to send a copy of the critical system and security-related information to a dependable, secure, central logging host. As one might expect, this central log can receive a megabyte or more of data in a single day. To keep a system administrator from being overwhelmed by this volume of data, we have developed an easily configurable log file filter and monitor called swatch. Swatch monitors log files, filters out unwanted data, and takes one or more user-specified actions (ring bell, send mail, execute a script, etc.) based upon patterns in the log.
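
The pattern-and-action filtering described in the abstract can be illustrated with a minimal Python sketch. It is not swatch's actual implementation or configuration syntax. The names `Rule`, `scan`, `echo`, and `queue_mail`, the sample log lines, and the first-matching-rule-wins policy are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

Action = Callable[[str], None]


@dataclass
class Rule:
    """A pattern plus the actions to run on each matching line.

    An empty action list means "discard matching lines silently".
    """
    pattern: re.Pattern
    actions: List[Action] = field(default_factory=list)


def scan(lines: Iterable[str], rules: List[Rule]) -> List[str]:
    """Run the first matching rule for each line; return lines no rule matched."""
    unmatched = []
    for raw in lines:
        line = raw.rstrip("\n")
        for rule in rules:
            if rule.pattern.search(line):
                for action in rule.actions:
                    action(line)
                break  # first matching rule wins (an assumption of this sketch)
        else:
            unmatched.append(line)
    return unmatched


# Hypothetical actions: print the line, or queue it as a stand-in for sending mail.
mail_queue: List[str] = []


def echo(line: str) -> None:
    print("ALERT:", line)


def queue_mail(line: str) -> None:
    mail_queue.append(line)


# Hypothetical rules and log lines, invented for this example.
rules = [
    Rule(re.compile(r"sendmail\[\d+\]")),  # routine noise: filter out
    Rule(re.compile(r"su: .*failed", re.IGNORECASE), [echo, queue_mail]),
    Rule(re.compile(r"file system full"), [echo]),
]

sample_log = [
    "Nov  1 10:00:01 hostA sendmail[123]: message queued",
    "Nov  1 10:00:07 hostB su: 'su root' failed for guest on /dev/ttyp1",
    "Nov  1 10:01:30 hostC vmunix: /usr: file system full",
    "Nov  1 10:02:00 hostA login: user logged in",
]

leftover = scan(sample_log, rules)
```

Because `scan` accepts any iterable of lines, the same function could be fed an open log file instead of a list. Following a file as it grows, as a long-running monitor would, is left out of the sketch.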