With microscope and tweezers: the worm from MIT's perspective
Communications of the ACM
A static analyzer for finding dynamic programming errors
Software—Practice & Experience
Further analysis of a computing center environment
Communications of the ACM
Bugs as deviant behavior: a general approach to inferring errors in systems code
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
The SLAM project: debugging system software via static analysis
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Bug isolation via remote program sampling
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Automating Software Failure Reporting
Queue - System Failures
Crash Data Collection: A Windows Case Study
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
Vigilante: end-to-end containment of internet worms
Proceedings of the twentieth ACM symposium on Operating systems principles
Rx: treating bugs as allergies---a safe method to survive software failures
Proceedings of the twentieth ACM symposium on Operating systems principles
Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Thorough static analysis of device drivers
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Enhancing server availability and security through failure-oblivious computing
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Scrash: a system for generating secure crash information
SSYM'03 Proceedings of the 12th conference on USENIX Security Symposium - Volume 12
Windows XP kernel crash analysis
LISA '06 Proceedings of the 20th conference on Large Installation System Administration
Triage: diagnosing production run failures at the user's site
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Better bug reporting with better privacy
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
AIEE-IRE '51 Papers and discussions presented at the Dec. 10-12, 1951, joint AIEE-IRE computer conference: Review of electronic digital computers
Deadlock immunity: enabling systems to defend against deadlocks
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Formal specifications on industrial-strength code—from myth to reality
CAV'06 Proceedings of the 18th international conference on Computer Aided Verification
Fast byte-granularity software fault isolation
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Dynamically checking ownership policies in concurrent c/c++ programs
Proceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
SherLog: error diagnosis by connecting clues from run-time logs
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Fingerprinting the datacenter: automated classification of performance crises
Proceedings of the 5th European conference on Computer systems
Towards versatile performance models for complex, popular applications
ACM SIGMETRICS Performance Evaluation Review
A query language for understanding component interactions in production systems
Proceedings of the 24th ACM International Conference on Supercomputing
Practical performance models for complex, popular applications
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Mugshot: deterministic capture and replay for Javascript applications
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Tolerating malicious device drivers in Linux
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
Proceedings of the 5th international symposium on Software visualization
Low-overhead bug fingerprinting for fast debugging
RV'10 Proceedings of the First international conference on Runtime verification
Improving software diagnosability via log enhancement
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Advancing the state of home networking
Communications of the ACM
Striking a new balance between program instrumentation and debugging time
Proceedings of the sixth conference on Computer systems
Cycles, cells and platters: an empirical analysisof hardware failures on a million consumer PCs
Proceedings of the sixth conference on Computer systems
Characterizing the differences between pre- and post- release versions of software
Proceedings of the 33rd International Conference on Software Engineering
If the SOK fits, wear it: pragmatic process improvement through software operation knowledge
PROFES'11 Proceedings of the 12th international conference on Product-focused software process improvement
Fay: extensible distributed tracing from kernels to clusters
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Catch me if you can: performance bug detection in the wild
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Software analytics as a learning case in practice: approaches and experiences
Proceedings of the International Workshop on Machine Learning Technologies in Software Engineering
The power of propagation: on the role of software operation knowledge within software ecosystems
Proceedings of the International Conference on Management of Emergent Digital EcoSystems
RTRP: right time right place kernel analysis tool
Proceedings of the 2011 ACM Symposium on Research in Applied Computation
Improving Software Diagnosability via Log Enhancement
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Data races vs. data race bugs: telling the difference with portend
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
scc: cluster storage provisioning informed by application characteristics and SLAs
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
Performance debugging in the large via mining millions of stack traces
Proceedings of the 34th International Conference on Software Engineering
Information needs for software development analytics
Proceedings of the 34th International Conference on Software Engineering
Characterizing and predicting which bugs get reopened
Proceedings of the 34th International Conference on Software Engineering
ReBucket: a method for clustering duplicate crash reports based on call stack similarity
Proceedings of the 34th International Conference on Software Engineering
Rivet: browser-agnostic remote debugging for web applications
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Predicting recurring crash stacks
Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering
Fay: Extensible Distributed Tracing from Kernels to Clusters
ACM Transactions on Computer Systems (TOCS)
CoRD: a collaborative framework for distributed data race detection
HotDep'12 Proceedings of the Eighth USENIX conference on Hot Topics in System Dependability
Collaborative energy debugging for mobile devices
HotDep'12 Proceedings of the Eighth USENIX conference on Hot Topics in System Dependability
AppInsight: mobile app performance monitoring in the wild
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Be conservative: enhancing failure diagnosis with proactive logging
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Production-run software failure diagnosis via hardware performance counters
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Using likely invariants for automated software fault localization
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
More testers - The effect of crowd size and time restriction in software testing
Information and Software Technology
Chronicler: lightweight recording to reproduce field failures
Proceedings of the 2013 International Conference on Software Engineering
Automated debugging for arbitrarily long executions
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Interactive record/replay for web application debugging
Proceedings of the 26th annual ACM symposium on User interface software and technology
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
Carat: collaborative energy diagnosis for mobile devices
Proceedings of the 11th ACM Conference on Embedded Networked Sensor Systems
VirtuOS: an operating system with kernel virtualization
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
RaceMob: crowdsourced data race detection
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
DeltaPath: Precise and Scalable Calling Context Encoding
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.02 |
Windows Error Reporting (WER) is a distributed system that automates the processing of error reports coming from an installed base of a billion machines. WER has collected billions of error reports in ten years of operation. It collects error data automatically and classifies errors into buckets, which are used to prioritize developer effort and report fixes to users. WER uses a progressive approach to data collection, which minimizes overhead for most reports yet allows developers to collect detailed information when needed. WER takes advantage of its scale to use error statistics as a tool in debugging; this allows developers to isolate bugs that could not be found at smaller scale. WER has been designed for large scale: one pair of database servers can record all the errors that occur on all Windows computers worldwide.