Pinpoint: Problem Determination in Large, Dynamic Internet Services
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Performance debugging for distributed systems of black boxes
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
ICAC '05 Proceedings of the Second International Conference on Automatic Computing
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Using magpie for request extraction and workload modelling
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
BorderPatrol: isolating events for black-box tracing
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Hunting for problems with Artemis
WASL'08 Proceedings of the First USENIX conference on Analysis of system logs
SALSA: analyzing logs as state machines
WASL'08 Proceedings of the First USENIX conference on Analysis of system logs
Mining console logs for large-scale system problem detection
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques
X-trace: a pervasive network tracing framework
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Detecting application-level failures in component-based Internet services
IEEE Transactions on Neural Networks
Ganesha: blackBox diagnosis of MapReduce systems
ACM SIGMETRICS Performance Evaluation Review
An Analysis of Traces from a Production MapReduce Cluster
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Synoptic: summarizing system logs with refinement
SLAML'10 Proceedings of the 2010 workshop on Managing systems via log analysis and machine learning techniques
ASDF: an automated, online framework for diagnosing performance problems
Architecting dependable systems VII
Otus: resource attribution in data-intensive clusters
Proceedings of the second international workshop on MapReduce and its applications
New ideas track: testing mapreduce-style programs
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
Mining temporal invariants from partially ordered logs
SLAML '11 Managing Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques
Understanding and improving the diagnostic workflow of MapReduce users
CHIMIT '11 Proceedings of the 5th ACM Symposium on Computer Human Interaction for Management of Information Technology
Mining temporal invariants from partially ordered logs
ACM SIGOPS Operating Systems Review
An improved partitioning mechanism for optimizing massive data analysis using MapReduce
The Journal of Supercomputing
Hi-index | 0.00 |
Mochi, a new visual, log-analysis based debugging tool correlates Hadoop's behavior in space, time and volume, and extracts a causal, unified control- and data-flow model of Hadoop across the nodes of a cluster. Mochi's analysis produces visualizations of Hadoop's behavior using which users can reason about and debug performance issues. We provide examples of Mochi's value in revealing a Hadoop job's structure, in optimizing real-world workloads, and in identifying anomalous Hadoop behavior, on the Yahoo! M45 Hadoop cluster.