Decision support for safe AI design

Author: Bill Hibbard
Affiliation: SSEC, University of Wisconsin, Madison, WI
Venue: AGI'12 Proceedings of the 5th international conference on Artificial General Intelligence
Year: 2012

Abstract

There is considerable interest in ethical designs for artificial intelligence (AI) that do not pose risks to humans. This paper proposes using elements of Hutter's agent-environment framework to define a decision support system for simulating, visualizing, and analyzing AI designs in order to understand their consequences. The simulations need not be accurate predictions of the future; rather, they show the futures that an agent design predicts will fulfill its motivations, which AI designers can explore to find risks to humans. To create a simulation model safely, the paper shows that the most probable finite stochastic program explaining a finite history is finitely computable, and that there is an agent that performs this computation without any unintended instrumental actions. It also discusses the risks of running an AI in a simulated environment.
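
The idea of picking the most probable finite stochastic program for a finite history can be illustrated with a toy sketch. It is not the paper's construction: it assumes a tiny hand-written candidate set instead of an enumeration of programs, models each candidate as an i.i.d. Bernoulli source, and assumes a prior of 2^-length weighted by the likelihood of the history. The names `likelihood`, `most_probable_model`, and `CANDIDATES` are hypothetical.

```python
from fractions import Fraction


def likelihood(p_one, history):
    """Probability that an i.i.d. Bernoulli(p_one) source emits `history`."""
    result = Fraction(1)
    for bit in history:
        result *= p_one if bit == 1 else 1 - p_one
    return result


def most_probable_model(candidates, history):
    """Return the candidate name maximizing 2^-length * P(history | model).

    The 2^-length prior is an assumption of this sketch; it favors
    shorter descriptions unless the data strongly supports a longer one.
    """
    def score(item):
        _name, (length_bits, p_one) = item
        return Fraction(1, 2 ** length_bits) * likelihood(p_one, history)

    return max(candidates.items(), key=score)[0]


# Hypothetical candidates: name -> (description length in bits, P(bit = 1)).
CANDIDATES = {
    "fair": (2, Fraction(1, 2)),
    "mostly_ones": (5, Fraction(9, 10)),
}

if __name__ == "__main__":
    print(most_probable_model(CANDIDATES, [1, 0, 1, 0]))
    print(most_probable_model(CANDIDATES, [1] * 10))
```

Because both the candidate set and the history are finite, the maximization is a finite search over exact rational scores. This is the sense in which the toy version is trivially computable.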