Evaluation metrics and methodologies for user-centered evaluation of intelligent systems

Authors:
Jean Scholtz;Emile Morse;Michelle Potts Steves
Affiliations:
Pacific Northwest National Laboratory, P.O. Box 999, Richland, WA 99352, USA;National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA;National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA
Venue:
Interacting with Computers
Year:
2006

Abstract

Over the past four years, we have worked with several research programs that were developing intelligent software for use by intelligence analysts. Our role in these programs was to develop metrics and methodologies for assessing the software's impact on its users, in this case intelligence analysts. In particular, we focused on metrics to evaluate how much the intelligent systems contribute to the users' tasks and what they cost the user in terms of workload and process deviations. In this paper, we describe the approach used. We started with two types of preliminary investigations: first, collecting and analyzing data from analysts working in an instrumented environment over a period of two years; and second, developing and conducting formative evaluations of research software. The long-term studies informed our ideas about the processes that analysts use and provided potential metrics in an environment without intelligent software tools. The formative evaluations helped us to define sets of application-specific metrics. Finally, we conducted assessments during and after technology insertions. We describe the metrics and methodologies used in each of these activities, along with the lessons learned.