Evaluation of an integrated multi-task machine learning system with humans in the loop

Authors:
Aaron Steinfeld;S. Rachael Bennett;Kyle Cunningham;Matt Lahut;Pablo-Alejandro Quinones;Django Wexler;Dan Siewiorek;Jordan Hayes;Paul Cohen;Julie Fitzgerald;Othar Hansson;Mike Pool;Mark Drummond
Affiliations:
Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Bitway, Inc.;U. of Southern California;JSF Consulting;Bitway, Inc.;IET;SRI International
Venue:
PerMIS '07 Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems
Year:
2007

Abstract

The performance of RADAR, a cognitive personal assistant that combines multiple machine learning components, natural language processing, and optimization, was examined with a test explicitly developed to measure the impact of integrated machine learning when used by a human user in a real-world setting. Three conditions (conventional tools, RADAR without learning, and RADAR with learning) were evaluated in a large-scale, between-subjects study. The study revealed that integrated machine learning produces a positive impact on overall performance. This paper also discusses how specific machine learning components contributed to human-system performance.