In this paper we present a presentation training system that observes a presentation rehearsal and provides the speaker with recommendations for improving delivery, such as speaking more slowly or looking at the audience. Our system, "Presentation Sensei," is equipped with a microphone and camera and analyzes a presentation by combining speech and image processing techniques. Based on the results of this analysis, the system gives the speaker instant feedback on speaking rate, eye contact with the audience, and timing. It also alerts the speaker when any of these indices exceeds a predefined warning threshold. After the presentation, the system generates visual summaries of the analysis results for the speaker's self-examination. Our goal is not to improve the content at a semantic level, but to improve its delivery by reducing inappropriate basic behavior patterns. We asked several test users to try the system, and they found it very useful for improving their presentations. We also compared the system's output with the observations of a human evaluator; the results show that the system successfully detected several instances of inappropriate behavior. The contribution of this work is to introduce a practical recognition-based human training system and to demonstrate its feasibility despite the limitations of state-of-the-art speech and video recognition technologies.
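The alerting behavior described above — warning the speaker when a monitored index exceeds a predefined threshold — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the metric names and threshold values are hypothetical assumptions, and the real system derives its indices from live speech and image processing.

```python
# Hypothetical sketch of threshold-based warning logic.
# Metric names and limits below are illustrative assumptions,
# not values taken from the Presentation Sensei system.

WARNING_THRESHOLDS = {
    "speaking_rate_units_per_sec": 9.0,   # warn if speaking too fast
    "seconds_since_eye_contact": 10.0,    # warn if not facing the audience
    "elapsed_time_ratio": 1.0,            # warn if over the allotted time
}

def check_metrics(metrics: dict) -> list:
    """Return a warning message for every metric above its threshold."""
    warnings = []
    for name, limit in WARNING_THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            warnings.append(f"{name}: {value:.1f} exceeds limit {limit:.1f}")
    return warnings

# Example: the speaker is talking too fast and has looked away too long,
# but is still within the allotted time, so two warnings fire.
alerts = check_metrics({
    "speaking_rate_units_per_sec": 11.2,
    "seconds_since_eye_contact": 14.0,
    "elapsed_time_ratio": 0.6,
})
print(len(alerts))  # 2
```

In a real-time setting this check would run periodically against continuously updated recognizer outputs, so that feedback reaches the speaker during the rehearsal rather than only afterward.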