Software connectors for highly distributed and voluminous data-intensive systems

  • Authors: Nenad Medvidovic; Christian Alan Mattmann
  • Affiliations: University of Southern California; University of Southern California
  • Venue: Software connectors for highly distributed and voluminous data-intensive systems
  • Year: 2007

Abstract

Data-intensive systems and applications transfer large volumes of data and metadata to highly distributed users separated by geographic distance and organizational boundaries. A dominant factor in these large-volume data transfers is the selection of an appropriate software connector that satisfies user constraints on the required data distribution scenarios. This task is typically accomplished by consulting "gurus" who rely on their intuitions, at best backed by anecdotal evidence. In this dissertation we motivate, present and evaluate a systematic, software architecture-based framework for selecting software connectors based on eight key dimensions of data distribution that we use to represent the data distribution scenarios. Our framework, dubbed DISCO, accurately, efficiently, and reliably captures a guru's domain knowledge and allows a user to automatically leverage that knowledge to drive connector selection. In addition, DISCO allows a user to validate a guru's domain knowledge against actual performance measurements of the connectors in the areas of efficiency, scalability, dependability and consistency. We provide a set of models, algorithms, techniques and tools to represent data distribution scenarios, classify and select connectors, and explore the trade-off space when architecting large-scale data distribution systems. To date, 13 real-world connectors across four connector families have been explored using our framework. We validate our framework empirically and qualitatively, employing 30 data distribution scenarios gleaned from three real-world projects spanning planetary science, cancer research and earth science at NASA's Jet Propulsion Laboratory. We use a number of measures of accuracy, including precision, recall and error rate. We also provide a theoretical performance analysis of our connector selection algorithms.
We report empirical performance measurements of the 13 connectors and use the data to revise and validate our precision measurements. In addition to our validation, we have integrated DISCO as a "plug-in" component of an independently developed COTS interoperability assessment framework, providing further feedback from a second use case of the tool. We conclude the dissertation with a set of open research questions that will frame our future work.