We present the first class of mathematically rigorous, general, fully self-referential, self-improving, optimal reinforcement learning systems. Such a system rewrites any part of its own code as soon as it has found a proof that the rewrite is useful. The problem-dependent utility function, the hardware, and the entire initial code are described by axioms encoded in an initial proof searcher, which is itself part of the initial code. The searcher systematically and efficiently tests computable proof techniques (programs whose outputs are proofs) until it finds a provably useful, computable self-rewrite. We show that such a self-rewrite is globally optimal (no local maxima!), since the code first had to prove that it is not useful to continue the proof search for alternative self-rewrites. Unlike previous non-self-referential methods based on hardwired proof searchers, ours not only boasts an optimal order of complexity but can also optimally reduce any slowdowns hidden by the O() notation, provided the utility of such speed-ups is provable at all.
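
The control loop described above can be illustrated with a deliberately tiny sketch. It is not the construction from the paper: the environment, the names `utility`, `proves_improvement`, and `self_improve`, and the candidate list are all hypothetical. Exhaustive evaluation over a finite toy environment stands in for a formal proof, and the searcher itself is not rewritable here, so the self-referential part is omitted.

```python
from typing import Callable, Iterable

# A policy maps a state to an action (both plain ints in this toy setting).
Policy = Callable[[int], int]

# Hypothetical toy environment: reward 1 when the action equals the state's parity.
STATES = range(8)


def utility(policy: Policy) -> int:
    """Exact utility over the finite toy environment.

    Stands in for the axiomatized, problem-dependent utility function.
    """
    return sum(1 for s in STATES if policy(s) == s % 2)


def proves_improvement(current: Policy, candidate: Policy) -> bool:
    """Toy 'proof check': exhaustive evaluation replaces a formal proof."""
    return utility(candidate) > utility(current)


def self_improve(initial: Policy, candidates: Iterable[Policy]) -> Policy:
    """Switch to a candidate as soon as its usefulness has been verified."""
    current = initial
    for candidate in candidates:
        if proves_improvement(current, candidate):
            current = candidate  # execute the verified rewrite immediately
    return current


if __name__ == "__main__":
    always_zero: Policy = lambda s: 0
    always_one: Policy = lambda s: 1
    parity: Policy = lambda s: s % 2

    final = self_improve(always_zero, [always_zero, always_one, parity])
    print(utility(always_zero), utility(final))
```

In this sketch a rewrite is accepted only after the check succeeds, which mirrors the "rewrite only when provably useful" rule. What it leaves out is the cost of continuing the search, which in the abstract must itself be part of what is proved.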