In this paper we present a presentation training system that observes a presentation rehearsal and provides the speaker with recommendations for improving delivery, such as speaking more slowly or looking at the audience. Our system, "Presentation Sensei," is equipped with a microphone and camera and analyzes a presentation by combining speech and image processing techniques. Based on the results of this analysis, the system gives the speaker instant feedback on speaking rate, eye contact with the audience, and timing. It also alerts the speaker when any of these indices exceeds a predefined warning threshold. After the presentation, the system generates visual summaries of the analysis results for the speaker's self-examination. Our goal is not to improve the content at a semantic level, but to improve its delivery by reducing inappropriate basic behavior patterns. We asked several test users to try the system, and they found it very useful for improving their presentations. We also compared the system's output with the observations of a human evaluator; the results show that the system successfully detected several instances of inappropriate behavior. The contribution of this work is to introduce a practical recognition-based human training system and to demonstrate its feasibility despite the limitations of state-of-the-art speech and video recognition technologies.
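The alerting behavior described above — warning the speaker when a monitored index exceeds a predefined threshold — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the metric names and threshold values are hypothetical assumptions, and the real system derives its indices from live speech and image processing.

```python
# Hypothetical sketch of threshold-based warning logic.
# Metric names and limits below are illustrative assumptions,
# not values taken from the Presentation Sensei system.

WARNING_THRESHOLDS = {
    "speaking_rate_units_per_sec": 9.0,   # warn if speaking too fast
    "seconds_since_eye_contact": 10.0,    # warn if not facing the audience
    "elapsed_time_ratio": 1.0,            # warn if over the allotted time
}

def check_metrics(metrics: dict) -> list:
    """Return a warning message for every metric above its threshold."""
    warnings = []
    for name, limit in WARNING_THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            warnings.append(f"{name}: {value:.1f} exceeds limit {limit:.1f}")
    return warnings

# Example: the speaker is talking too fast and has looked away too long,
# but is still within the allotted time, so two warnings fire.
alerts = check_metrics({
    "speaking_rate_units_per_sec": 11.2,
    "seconds_since_eye_contact": 14.0,
    "elapsed_time_ratio": 0.6,
})
print(len(alerts))  # 2
```

In a real-time setting this check would run periodically against continuously updated recognizer outputs, so that feedback reaches the speaker during the rehearsal rather than only afterward.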