Heuristic evaluation promises to be a low-cost usability evaluation method, but it suffers from subjective interpretation and from a proliferation of competing, contradictory heuristic lists. This is particularly true in games research, where no rigorous comparative validation has yet been published. To validate the available heuristics, a user test of a commercial game is conducted with 6 participants, identifying 88 issues; 3 evaluators then rate 146 heuristics for relevance against these issues. Inter-rater reliability is weak (Krippendorff's alpha = 0.343), so none of the available heuristic sets can be validated. This weak reliability stems from the high complexity of video games: evaluators infer different reasonable causes and solutions for the same issues, and their heuristic ratings therefore vary widely.
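The reliability statistic reported above, Krippendorff's alpha, compares observed disagreement among raters to the disagreement expected by chance (alpha = 1 − D_o/D_e). A minimal sketch for nominal-level data is below; note this is an illustration of the general statistic, not the paper's own analysis code, and the paper's exact distance metric (nominal vs. ordinal) is not stated here.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists; each inner list holds the ratings one unit
    (e.g. one heuristic) received from the raters. Units with fewer
    than 2 ratings contribute no pairable values and are skipped.
    """
    # Coincidence matrix: each ordered pair of ratings within a unit
    # of m ratings contributes weight 1/(m-1).
    coincidence = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue
        for c, k in permutations(ratings, 2):
            coincidence[(c, k)] += 1.0 / (m - 1)

    # Marginal totals n_c and grand total n of pairable values.
    n_c = Counter()
    for (c, _k), w in coincidence.items():
        n_c[c] += w
    n = sum(n_c.values())
    if n <= 1:
        return None  # not enough pairable values

    # Observed vs. expected disagreement (nominal distance: 0 if equal, 1 otherwise).
    d_o = sum(w for (c, k), w in coincidence.items() if c != k) / n
    d_e = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n * (n - 1))
    if d_e == 0:
        return 1.0  # all raters used a single category
    return 1.0 - d_o / d_e
```

For intuition: perfect agreement (e.g. `[[1, 1], [2, 2], [3, 3]]`) yields alpha = 1.0, while chance-level agreement yields alpha near 0; the paper's 0.343 falls well below common acceptability thresholds such as 0.667.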