Testing probabilistic equivalence through Reinforcement Learning

Authors:
Josée Desharnais; François Laviolette; Sami Zhioua
Affiliations:
Université Laval, Québec (QC), Canada;Université Laval, Québec (QC), Canada;ICS, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Venue:
Information and Computation
Year:
2013

Abstract

Checking whether a given system implementation respects its specification is often done by proving that the two are "equivalent". The equivalence is chosen, in particular, for its computability and, of course, for its meaning, that is, for how well it matches what is observable from the two systems (implementation and specification). Trace equivalence is easily testable (decidable from interaction) but is often considered too weak; in contrast, bisimulation is accepted as the canonical equivalence for interaction, but it is not testable. Richer than an equivalence is a form of distance: it is zero between equivalent systems, and it provides an estimate of their difference if the systems are not equivalent. Our main contribution is to define such a distance in a context where (1) the two systems to be compared have stochastic behavior; (2) the model of one of them (e.g., the implementation) is unknown, hence our only knowledge is obtained by interacting with it; (3) consequently, the target equivalence (observed when the distance is zero) must be testable. To overcome the problem that the model is unknown, we use a Reinforcement Learning approach that provides powerful stochastic algorithms that only need to interact with the model. Our second main contribution is a new family of testable equivalences, called K-moment. The weakest of them, 1-moment equivalence, is trace equivalence; as K grows, K-moment equivalences become finer, all remaining, as well as their limit, weaker than bisimulation. We propose a framework to define (and test) a bigger class of testable equivalences, Test-Observation-Equivalences (TOEs), and we show how they can be made coarser or not, by tuning some parameters.
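To make the notion of "testable from interaction" concrete, the following is a minimal illustrative sketch in Python. It is not the reinforcement-learning method or the distance defined in the paper: it uses plain Monte Carlo sampling, and every name in it (`make_system`, `estimate`, `empirical_trace_gap`, the three toy systems) is a hypothetical introduced here. Each system is treated as a black box that, given a sequence of actions, only reports whether the whole sequence was accepted. The largest gap between the empirical acceptance frequencies of two systems then behaves like a crude distance with respect to traces: it is close to zero when the systems agree on the sampled traces and grows with their difference.

```python
import random

def make_system(transitions, seed):
    """Build a black-box `run(trace)` for a toy probabilistic system.

    `transitions` maps (state, action) to a list of (probability, next_state).
    Any probability mass that is missing means the action is refused.
    All of this is an illustrative assumption, not the paper's model.
    """
    rng = random.Random(seed)

    def run(trace):
        state = 0  # every toy system starts in state 0
        for action in trace:
            r = rng.random()
            acc = 0.0
            for p, nxt in transitions.get((state, action), []):
                acc += p
                if r < acc:
                    state = nxt
                    break
            else:
                return False  # action refused: the trace is not accepted
        return True

    return run

def estimate(run, trace, n):
    """Empirical probability that `trace` is accepted, from n interactions."""
    return sum(run(trace) for _ in range(n)) / n

def empirical_trace_gap(run_a, run_b, traces, n=20000):
    """Largest difference in estimated acceptance probability over `traces`."""
    return max(abs(estimate(run_a, t, n) - estimate(run_b, t, n)) for t in traces)

# Specification: 'a' always succeeds, then 'b' succeeds with probability 0.5.
SPEC = {(0, "a"): [(1.0, 1)], (1, "b"): [(0.5, 2)]}
# Implementation: 'a' branches 50/50 into a state that always accepts 'b'
# and a state that always refuses it. Same trace probabilities as SPEC,
# but the two systems are not bisimilar.
IMPL = {(0, "a"): [(0.5, 1), (0.5, 3)], (1, "b"): [(1.0, 2)]}
# Faulty implementation: 'b' succeeds with probability 0.25 only.
FAULTY = {(0, "a"): [(1.0, 1)], (1, "b"): [(0.25, 2)]}

TRACES = [("a",), ("a", "b")]

if __name__ == "__main__":
    spec, impl, faulty = (make_system(s, seed=i) for i, s in enumerate([SPEC, IMPL, FAULTY]))
    print("spec vs impl  :", empirical_trace_gap(spec, impl, TRACES))
    print("spec vs faulty:", empirical_trace_gap(spec, faulty, TRACES))
```

In this toy setting the specification and the first implementation cannot be told apart by trace frequencies even though they are not bisimilar, which illustrates why trace equivalence is often considered too weak, while the faulty implementation shows a gap of roughly 0.25 on the trace `("a", "b")`.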