When a decision problem is modeled as an influence diagram, the quantitative part rests on two principal components: probabilities, which represent the decision maker's uncertainty about the domain, and utilities, which represent the decision maker's preferences. Over the last decade, several methods have been developed for learning the probabilities from a database. Methods for learning the utilities, however, have received only limited attention in the computer science community. A promising approach to learning a decision maker's utility function is to start from the decision maker's observed behavioral patterns and then find a utility function that, together with a domain model, can explain this behavior. That is, it is assumed that the decision maker's preferences are reflected in the behavior. Standard learning algorithms also assume that the decision maker is behaviorally consistent, i.e., given a model of the decision problem, there exists a utility function that can account for all the observed behavior. Unfortunately, this assumption is rarely valid in real-world decision problems, and in these situations existing learning methods may identify only a trivial utility function. In this paper we relax the consistency assumption and propose two algorithms for learning a decision maker's utility function from possibly inconsistent behavior; inconsistent behavior is interpreted as random deviations from an underlying (true) utility function. The main difference between the two algorithms is that the first facilitates a form of batch learning, whereas the second focuses on adaptation and is particularly well suited to scenarios where the decision maker's preferences change over time. Empirical results demonstrate the tractability of the algorithms and show that they converge toward the true utility function even for very small sets of observations.
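To make the consistency issue concrete, the following is a minimal Python sketch and not a rendering of the paper's two algorithms. It assumes a toy problem with three outcomes (worst, middle, best), utilities normalized so that u(worst) = 0 and u(best) = 1, and a logit (softmax) choice model as one possible reading of "random deviations"; the function names, the noise parameter `beta`, and the grid search are all illustrative assumptions. Each observation is a choice between the middle outcome for sure and a lottery that yields the best outcome with probability q.

```python
import math

def expected_utility(probs, utils):
    """Expected utility of a lottery given outcome probabilities and utilities."""
    return sum(p * u for p, u in zip(probs, utils))

def explains(utils, observations):
    """True if every observed choice maximizes expected utility under utils."""
    for options, chosen in observations:
        eus = [expected_utility(p, utils) for p in options]
        if eus[chosen] < max(eus) - 1e-12:
            return False
    return True

def log_likelihood(utils, observations, beta=5.0):
    """Log-likelihood under an assumed logit choice model with noise level beta."""
    total = 0.0
    for options, chosen in observations:
        scores = [beta * expected_utility(p, utils) for p in options]
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[chosen] - log_z
    return total

def fit_middle_utility(observations, grid=101):
    """Grid search over u(middle) in [0, 1], with u(worst)=0 and u(best)=1 fixed."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return max(candidates,
               key=lambda u: log_likelihood([0.0, u, 1.0], observations))

def make_observation(q, chose_sure):
    """Option 0: middle outcome for sure. Option 1: best with prob. q, else worst."""
    options = [[0.0, 1.0, 0.0], [1.0 - q, 0.0, q]]
    return options, 0 if chose_sure else 1

# Hypothetical behavior: the last two choices contradict each other
# (lottery preferred at q=0.6, sure outcome preferred at q=0.7).
observations = [
    make_observation(0.5, True),
    make_observation(0.8, False),
    make_observation(0.6, False),
    make_observation(0.7, True),
]

# A constant utility function "explains" everything, but only trivially.
trivial_ok = explains([0.0, 0.0, 0.0], observations)
# No normalized utility function is consistent with all four choices.
any_consistent = any(explains([0.0, i / 100, 1.0], observations)
                     for i in range(101))
# Treating the conflicts as noise still yields a usable estimate.
fitted = fit_middle_utility(observations)
```

In this toy data no normalized utility function accounts for all four choices, while the constant (trivial) one does, which mirrors the failure mode described above. Under the assumed noise model, the fitted value of u(middle) falls between the two conflicting thresholds rather than collapsing to the trivial solution.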