On combining decisions from multiple expert imitators for performance

Authors:
Jonathan Rubin;Ian Watson
Affiliations:
Department of Computer Science, University of Auckland, Auckland, New Zealand;Department of Computer Science, University of Auckland, Auckland, New Zealand
Venue:
IJCAI'11: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume One
Year:
2011

Abstract

One way for an artificially intelligent agent to maximise a performance metric in a given domain is to learn from training data consisting of actions or decisions made by an expert, in an attempt to imitate that expert's style. We refer to this type of agent as an expert imitator. In this paper we investigate whether performance can be improved by combining decisions from multiple expert imitators. In particular, we investigate two existing approaches for combining decisions. The first approach employs ensemble voting between multiple expert imitators. The second approach dynamically selects the best imitator to use at runtime, given the performance of the imitators in the current environment. We investigate these approaches in the domain of computer poker. Specifically, we create expert imitators for limit and no-limit Texas Hold'em and determine whether their performance can be improved by combining their decisions using these two approaches.
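The two combination approaches named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which voting rule or selection rule the authors use, so everything here is an assumption made for illustration: `majority_vote` is a plain plurality vote with ties going to the earliest voter, and `UCB1Selector` uses the standard UCB1 bandit rule as one possible way to pick an imitator from observed performance. Both names are hypothetical, and rewards are assumed to be scaled to the range [0, 1].

```python
import math
from collections import Counter


def majority_vote(decisions):
    """Plurality vote over the imitators' proposed actions.

    Assumption: ties are broken in favour of the action proposed first.
    """
    if not decisions:
        raise ValueError("at least one decision is required")
    counts = Counter(decisions)
    best = max(counts.values())
    for action in decisions:
        if counts[action] == best:
            return action


class UCB1Selector:
    """Chooses which imitator to use next with the UCB1 bandit rule.

    Assumption: rewards passed to update() are scaled to [0, 1].
    """

    def __init__(self, n_imitators):
        self.counts = [0] * n_imitators    # times each imitator was used
        self.totals = [0.0] * n_imitators  # summed reward per imitator

    def select(self):
        # Use every imitator once before comparing confidence bounds.
        for i, used in enumerate(self.counts):
            if used == 0:
                return i
        t = sum(self.counts)
        scores = [
            self.totals[i] / self.counts[i]
            + math.sqrt(2.0 * math.log(t) / self.counts[i])
            for i in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, imitator, reward):
        self.counts[imitator] += 1
        self.totals[imitator] += reward


# Ensemble voting: three hypothetical imitators propose a betting action.
action = majority_vote(["call", "raise", "call"])

# Dynamic selection: pick an imitator, play a hand, report the outcome.
selector = UCB1Selector(n_imitators=3)
chosen = selector.select()
selector.update(chosen, reward=0.6)  # 0.6 is a made-up normalised outcome
```

In this sketch, voting consults every imitator on each decision, whereas the selector commits to one imitator at a time and shifts towards whichever has performed best so far in the current environment.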