Optimizing data analysis with a semi-structured time series database

Authors:
Ledion Bitincka; Archana Ganapathi; Stephen Sorkin; Steve Zhang
Affiliations:
Splunk Inc.; Splunk Inc.; Splunk Inc.; Splunk Inc.
Venue:
SLAML'10 Proceedings of the 2010 workshop on Managing systems via log analysis and machine learning techniques
Year:
2010

Abstract

Most modern systems generate abundant and diverse log data. With dwindling storage costs, there are fewer reasons to summarize or discard data. However, the lack of tools to efficiently store and cross-correlate heterogeneous datasets makes it tedious to mine the data for analytic insights. In this paper, we present Splunk, a semi-structured time series database that can be used to index, search, and analyze massive heterogeneous datasets. We share observations, lessons, and case studies from real-world datasets, and demonstrate Splunk's power and flexibility for enabling insightful data mining searches.