
The seminal work on the “cooperate first, then repeat the other player’s last move” (tit-for-tat) strategy in game theory

Robert Axelrod and William D. Hamilton, The Evolution of Cooperation, Science, New Series, Vol. 211, No. 4489 (Mar. 27, 1981), pp. 1390–1396. https://ee.stanford.edu/~hellman/Breakthrough/book/pdfs/axelrod.pdf

Cooperation in organisms, whether bacteria or primates, has been a difficulty for evolutionary theory since Darwin. On the assumption that interactions between pairs of individuals occur on a probabilistic basis, a model is developed based on the concept of an evolutionarily stable strategy in the context of the Prisoner’s Dilemma game. Deductions from the model, and the results of a computer tournament, show how cooperation based on reciprocity can get started in an asocial world, can thrive while interacting with a wide range of other strategies, and can resist invasion once fully established. Potential applications include specific aspects of territoriality, mating, and disease.
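
As a concrete illustration of the strategy the paper calls TIT FOR TAT, here is a minimal sketch of an iterated Prisoner’s Dilemma match in Python. The payoff values (3, 5, 1, 0) are the standard ones used in Axelrod’s tournaments; the opponent strategies and the 200-round match length are illustrative assumptions, not taken from the paper.

```python
import random

# Standard Prisoner's Dilemma payoffs (Axelrod's tournament values):
# both cooperate -> 3 each, both defect -> 1 each,
# lone defector -> 5, lone cooperator -> 0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(own_history, opponent_history):
    """Cooperate on the first move, then repeat the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(own_history, opponent_history):
    return "D"

def random_player(own_history, opponent_history):
    return random.choice("CD")

def play_match(strategy_a, strategy_b, rounds=200):
    """Play an iterated Prisoner's Dilemma match and return the total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

if __name__ == "__main__":
    for opponent in (tit_for_tat, always_defect, random_player):
        s_tft, s_opp = play_match(tit_for_tat, opponent)
        print(f"tit_for_tat vs {opponent.__name__}: {s_tft} - {s_opp}")
```

Against always_defect, tit-for-tat gives up only the first round before defecting in kind; against itself it cooperates throughout, which is the reciprocity mechanism the abstract describes.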

A nice summary of reinforcement learning in control (Adaptive Dynamic Programming) and of the use of Q-learning with neural-network approximators to solve a control problem within a game-theoretic framework

Kyriakos G. Vamvoudakis, Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems, Automatica, Volume 61, November 2015, Pages 274-281, ISSN 0005-1098, DOI: 10.1016/j.automatica.2015.08.017.

This work proposes a novel Q-learning algorithm to solve the problem of non-zero-sum Nash games of linear time-invariant systems with N players (control inputs) and centralized uncertain/unknown dynamics. We first formulate the Q-function of each player as a parametrization of the state and of all the other players’ control inputs. An integral reinforcement learning approach is used to develop a model-free structure of N actors / N critics to estimate the parameters of the N coupled Q-functions online, while also guaranteeing closed-loop stability and convergence of the control policies to a Nash equilibrium. A fourth-order simulation example with five players is presented to show the efficacy of the proposed approach.
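
For intuition about the fixed point such schemes converge to, the following is a minimal model-based sketch of the coupled-Riccati policy iteration (Lyapunov iterations) for a two-player feedback Nash LQ game. This is not the paper’s integral-RL actor/critic algorithm, which reaches the same kind of Nash gains model-free from online data; the system matrices, cost weights, and iteration scheme below are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative two-player non-zero-sum LQ game (matrices are made up):
#   dx/dt = A x + B1 u1 + B2 u2,   J_i = integral of (x'Q_i x + u_i'R_i u_i) dt
A  = np.array([[0.0, 1.0], [-1.0, -2.0]])
B1 = np.array([[0.0], [1.0]])
B2 = np.array([[1.0], [0.0]])
Q1, Q2 = np.eye(2), 2.0 * np.eye(2)
R1, R2 = np.array([[1.0]]), np.array([[1.0]])

def lyapunov_iteration(A, B1, B2, Q1, Q2, R1, R2, iters=100, tol=1e-9):
    """Model-based policy iteration for the coupled Riccati equations of a
    two-player feedback Nash LQ game. Convergence is not guaranteed in
    general, but typically holds for mild, stabilizable problems like this
    toy example."""
    n = A.shape[0]
    K1 = np.zeros((B1.shape[1], n))   # initial stabilizing gains
    K2 = np.zeros((B2.shape[1], n))   # (A itself is Hurwitz here)
    for _ in range(iters):
        Ac = A - B1 @ K1 - B2 @ K2    # closed loop under current policies
        # Policy evaluation: solve Ac' P_i + P_i Ac + Q_i + K_i' R_i K_i = 0
        P1 = solve_continuous_lyapunov(Ac.T, -(Q1 + K1.T @ R1 @ K1))
        P2 = solve_continuous_lyapunov(Ac.T, -(Q2 + K2.T @ R2 @ K2))
        # Policy improvement: u_i = -K_i x with K_i = R_i^{-1} B_i' P_i
        K1_new = np.linalg.solve(R1, B1.T @ P1)
        K2_new = np.linalg.solve(R2, B2.T @ P2)
        converged = max(np.abs(K1_new - K1).max(),
                        np.abs(K2_new - K2).max()) < tol
        K1, K2 = K1_new, K2_new
        if converged:
            break
    return K1, K2, P1, P2

K1, K2, P1, P2 = lyapunov_iteration(A, B1, B2, Q1, Q2, R1, R2)
print("Nash feedback gains:\nK1 =", K1, "\nK2 =", K2)
```

In the paper, comparable gains are obtained without knowledge of the system matrices by parametrizing each player’s Q-function in the state and all control inputs and estimating those parameters online from measured trajectories.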