An efficient context-free parsing algorithm
Communications of the ACM
PADS: a domain-specific language for processing ad hoc data
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
From dirt to shovels: fully automatic tool generation from ad hoc data
Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Proceedings of the 14th International Conference on Database Theory
Bistro data feed management system
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Mining temporal invariants from partially ordered logs
SLAML '11 Managing Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques
LogSig: generating system events from raw textual logs
Proceedings of the 20th ACM international conference on Information and knowledge management
Mining temporal invariants from partially ordered logs
ACM SIGOPS Operating Systems Review
LearnPADS++: incremental inference of ad hoc data formats
PADL'12 Proceedings of the 14th international conference on Practical Aspects of Declarative Languages
Hi-index | 0.00 |
System logs come in a large and evolving variety of formats, many of which are semi-structured and/or non-standard. As a consequence, off-the-shelf tools for processing such logs often do not exist, forcing analysts to develop their own tools, which is costly and timeconsuming. In this paper, we present an incremental algorithm that automatically infers the format of system log files. From the resulting format descriptions, we can generate a suite of data processing tools automatically. The system can handle large-scale data sources whose formats evolve over time. Furthermore, it allows analysts to modify inferred descriptions as desired and incorporates those changes in future revisions.