An approach to feature location in distributed systems

  • Authors:
  • Dennis Edwards; Sharon Simmons; Norman Wilde

  • Affiliations:
  • Department of Computer Science, University of West Florida, 11000 University Parkway, Pensacola, FL 32514, USA (all authors)

  • Venue:
  • Journal of Systems and Software
  • Year:
  • 2006

Abstract

This paper describes an approach to the feature location problem for distributed systems, that is, to the problem of locating which code components are important in providing a particular feature for an end user. A feature is located by observing system execution and noting time intervals in which it is active. Traces of execution in intervals with and without the feature are compared. Earlier experience has shown that this analysis is difficult because distributed systems often exhibit stochastic behavior and because time intervals are hard to identify with precision. To get around these difficulties, the paper proposes a definition of time interval based on the causality analysis introduced by Lamport and others. A strict causal interval may be defined, but it must often be extended to capture latent events and to represent the inherent imprecision in time measurement. This extension is modeled using a weighting function which may be customized to the specific circumstances of each study. The end result of the analysis is a component relevance index, denoted p_c, which can be used to measure the relevance of a software component to a particular feature. Software engineers may focus their analysis efforts on the top components as ranked according to p_c. Two case studies are presented. The first study demonstrates the feasibility of p_c by applying our method to a well-defined distributed system. The second study demonstrates the versatility of p_c by applying our method to message logs obtained from a large military system. Both studies indicate that the suggested approach could be an effective guide for a software engineer who is maintaining or enhancing a distributed system.
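
The abstract does not give the formula for the relevance index, so the following is only a minimal illustrative sketch of the general idea, not the authors' definition. It assumes each trace event is a hypothetical (component, timestamp) pair, that the feature is active in a known interval, and that the interval is extended by a linear-decay weighting function with a hypothetical `margin` parameter. The score computed here (the average weight of a component's events) is an assumed stand-in for a relevance index.

```python
from collections import defaultdict


def weight(t, start, end, margin):
    """Assumed weighting function: 1 inside [start, end],
    decaying linearly to 0 over `margin` time units outside it.
    `margin` must be positive."""
    if start <= t <= end:
        return 1.0
    distance = start - t if t < start else t - end
    return max(0.0, 1.0 - distance / margin)


def relevance(events, start, end, margin):
    """Score each component by the average weight of its events.

    events: iterable of (component, timestamp) pairs (hypothetical format).
    Returns a dict mapping component to a score in [0, 1]; a component
    whose events all fall inside the feature interval scores 1.
    """
    weighted = defaultdict(float)
    counts = defaultdict(int)
    for component, t in events:
        weighted[component] += weight(t, start, end, margin)
        counts[component] += 1
    return {c: weighted[c] / counts[c] for c in counts}


def rank(scores):
    """Return components ordered from most to least relevant."""
    return sorted(scores, key=scores.get, reverse=True)


# Made-up trace for illustration only; timestamps could be logical clock values.
events = [
    ("auth", 1), ("auth", 5),
    ("billing", 5), ("billing", 6), ("billing", 8),
    ("logger", 1), ("logger", 5), ("logger", 12),
]
scores = relevance(events, start=4, end=7, margin=2)
ranking = rank(scores)
```

In this made-up trace, a component that runs throughout the execution (such as `logger`) scores lower than one whose events cluster around the feature interval, which is the kind of ranking an engineer could use to decide which components to inspect first.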