Policy teaching through reward function learning
Proceedings of the 10th ACM Conference on Electronic Commerce
Dynamic Supplier Contracts Under Asymmetric Inventory Information
Operations Research
Optimal Selling Scheme for Heterogeneous Consumers with Uncertain Valuations
Mathematics of Operations Research
Analysis of a Dynamic Adverse Selection Model with Asymptotic Efficiency
Mathematics of Operations Research
Solving an Infinite Horizon Adverse Selection Model Through Finite Policy Graphs
Operations Research
This paper proposes a general framework for a large class of multiperiod principal-agent problems in which a principal has a primary stake in the performance of a system but delegates its control to an agent. The underlying system is a Markov decision process whose state is observed only by the agent, while the agent's actions are observed by both parties. The paper develops a dynamic programming algorithm that derives optimal long-term contracts for the principal, who controls the underlying system indirectly by offering the agent a menu of continuation-utility vectors along public information paths; the agent's best response, expressed through his choice of continuation utilities, induces truthful state revelation and results in actions that maximize the principal's expected payoff. The problem is meaningful to the operations research community because it can be framed as optimally designing the reward structure of a Markov decision process with hidden states, and it has many applications of interest, as discussed in the paper.
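The reward-design view in the abstract can be sketched with a toy example: the principal picks a payment schedule for a small MDP, the agent best-responds by value iteration, and the principal keeps the schedule that maximizes her own payoff. This is only a minimal illustration under invented numbers (the transition matrix, outputs, costs, and payment grid are all hypothetical), and it deliberately omits the paper's key ingredient, incentive-compatible revelation of the hidden state via menus of continuation utilities, by letting the principal observe the state directly.

```python
import itertools
import numpy as np

# Toy delegated-control MDP: 2 states, 2 actions (all numbers hypothetical).
# P[a][s][t]: transition probability, y[s][a]: output to the principal,
# c[a]: agent's effort cost, beta: common discount factor.
P = np.array([[[0.9, 0.1], [0.6, 0.4]],    # action 0
              [[0.5, 0.5], [0.2, 0.8]]])   # action 1
y = np.array([[1.0, 2.0], [3.0, 6.0]])     # y[s, a]
c = np.array([0.0, 1.0])
beta = 0.9

def agent_best_response(w):
    """Agent's optimal policy (one action per state) via value iteration,
    given a payment schedule w[s, a]."""
    V = np.zeros(2)
    for _ in range(500):
        # Q[s, a] = payment - effort cost + discounted continuation value
        Q = w - c[None, :] + beta * np.einsum('ast,t->sa', P, V)
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

def principal_value(w, pi):
    """Principal's discounted payoff (output minus payments) from state 0,
    when the agent follows policy pi."""
    V = np.zeros(2)
    for _ in range(500):
        V = np.array([y[s, pi[s]] - w[s, pi[s]] + beta * P[pi[s], s] @ V
                      for s in range(2)])
    return V[0]

# Outer design problem: brute-force search over a coarse grid of payment
# schedules, anticipating the agent's best response to each.
best = max(
    (principal_value(np.array(w).reshape(2, 2),
                     agent_best_response(np.array(w).reshape(2, 2))), w)
    for w in itertools.product([0.0, 0.5, 1.0, 1.5], repeat=4)
)
print(round(best[0], 3), best[1])
```

The nested structure mirrors the abstract: an inner MDP solved by the agent and an outer reward-design problem solved by the principal. The paper's actual algorithm replaces the brute-force outer search with dynamic programming over continuation utilities and must additionally enforce truthful reporting of the hidden state.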