Game of Nim

This post starts a mini-series of two posts in which we want to solve the Game of Nim using reinforcement learning. The first part of this mini-series is devoted to a look at our own Nim-specific custom environment for OpenAI.

The Game of Nim is a simple two-player game. The rules of the game are indeed very simple; playing it well, however, is difficult for the uninitiated.
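For reference, classical Nim has a well-known optimal strategy based on the nim-sum (the xor of all heap sizes); the reinforcement learner in this series will have to discover something equivalent on its own. A minimal sketch (function names are ours, not from the post):

```python
from functools import reduce
from operator import xor

def nim_sum(heaps):
    """Xor of all heap sizes; nonzero means the player to move can win."""
    return reduce(xor, heaps, 0)

def winning_move(heaps):
    """Return (heap_index, new_size) for an optimal move, or None in a lost position."""
    s = nim_sum(heaps)
    if s == 0:
        return None              # every move hands the opponent a winning position
    for i, h in enumerate(heaps):
        target = h ^ s           # shrink this heap so the overall nim-sum becomes zero
        if target < h:
            return i, target
```

For example, from heaps `[3, 4, 5]` the optimal move reduces the first heap to 1, leaving `[1, 4, 5]` with nim-sum zero.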

Continue reading “Game of Nim”

Windy Walk (part 2) – addendum


  1. Previously we constructed a very simple class to emulate the type of environment provided by OpenAI: Windy Walk (part 1).
  2. Then we implemented the same windy walk model as an extension to OpenAI. For this we created a custom Python package, named gym_drifty_walk, which you can grab from github.
  3. We used these two versions to produce exactly the same “Windy Walk” results, as shown by the plots.

In this addendum post we’ll take a look at the package code in gym_drifty_walk.

There are excellent articles about how to create a Python package, and I have no intention of duplicating those; I recommend python-packaging. One warning: that packaging article is written for Python 2, so be aware of that.

Some things you would typically add (e.g. tests) are missing here. We chose the most basic approach that works, which we believe lowers the barrier to comprehension.

The most basic steps you need are these:

  • ✓ Choose a package name
    We have already done that: gym_drifty_walk
  • ✓ Follow the basic package structure
    Here the structure looks like this:

    (Remark: this tree representation of the directory structure can be obtained using the tree command.)
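As orientation, the standard layout for a custom gym environment package looks roughly like the following (this is the conventional structure from the gym documentation; the actual tree of gym_drifty_walk may differ in detail):

```text
gym_drifty_walk/
├── setup.py
└── gym_drifty_walk/
    ├── __init__.py          # registers the environment with gym
    └── envs/
        ├── __init__.py
        └── drifty_walk_env.py
```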

    Continue reading “Windy Walk (part 2) – addendum”

Windy Walk (part 2)

Recap: Previously we constructed a very simple class to emulate the type of environment provided by OpenAI. Then we simulated two windy walks and used the results of these walks to produce some plots.

Now we want to produce the same results again but via a different and more interesting route. This time we want a windy walk model that is implemented as an extension to OpenAI.

Continue reading “Windy Walk (part 2)”

Windy Walk

So, today we advance to something slightly more interesting.
OpenAI, founded by Elon Musk and Sam Altman, is a cool platform for testing artificial intelligence algorithms. The platform aims to provide a consistent interface for a host of different environments. It is also extensible.
The most important thing in this mini-series of two posts is this: we provide an environment that reacts to our action, but also with a random element. So the resulting reaction is a mixture of the intended action plus a random factor.
This topic is split into two posts:
  1. In the first post we introduce a very simple windy walk model. To keep things as simple as possible, this model is implemented without using OpenAI.
  2. In the second post the same windy walk model is implemented as an extension to OpenAI.
Let’s get started. We want to have an environment, implemented as a class, with two methods: reset (so the walk restarts at zero) and step (the walker takes a step and is shaken by the wind).
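Such a class might be sketched like this (a minimal, self-contained sketch; the class, attribute, and parameter names are our assumptions, not the notebook's actual code):

```python
import random

class DriftyWalk:
    """A 1-D walk: each step moves by an intended amount,
    shaken by a random 'wind' perturbation."""

    def __init__(self, intended_step_size=1.0, wind=1.0):
        self.intended_step_size = intended_step_size
        self.wind = wind          # maximum magnitude of the random drift
        self.position = 0.0

    def reset(self):
        """Restart the walk at zero and return the starting position."""
        self.position = 0.0
        return self.position

    def step(self):
        """Take one intended step plus a uniform random wind; return the new position."""
        drift = random.uniform(-self.wind, self.wind)
        self.position += self.intended_step_size + drift
        return self.position
```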


Next we take a walk…
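A self-contained sketch of what such a walk might look like (the parameter name follows the post's IntendedStepSize; the notebook's actual cell is not shown here):

```python
import random

random.seed(0)                    # reproducible wind, for illustration

intended_step_size = 1.0          # how far the walker *means* to go each step
wind = 1.0                        # maximum magnitude of the random drift

position = 0.0                    # reset: the walk starts at zero
positions = [position]
for _ in range(50):               # step: intended move plus random wind
    position += intended_step_size + random.uniform(-wind, wind)
    positions.append(position)
```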
We import the necessary functions for plotting…
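A sketch of such a plotting cell using matplotlib (simulating a fresh walk so the snippet runs on its own; the notebook's actual code may differ):

```python
import random
import matplotlib
matplotlib.use("Agg")             # off-screen backend; omit this line in a notebook
import matplotlib.pyplot as plt

# a freshly simulated windy walk to plot
random.seed(1)
positions = [0.0]
for _ in range(50):
    positions.append(positions[-1] + 1.0 + random.uniform(-1.0, 1.0))

plt.plot(positions)
plt.xlabel("step number")
plt.ylabel("position")
plt.savefig("windy_walk.png")
```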
Next we do a walk with a bigger IntendedStepSize.
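With a bigger intended step size the deliberate motion dominates the wind. A self-contained sketch of the comparison (the function and parameter names are ours, mirroring the post's IntendedStepSize):

```python
import random

random.seed(2)

def windy_walk(intended_step_size, n_steps=50, wind=1.0):
    """Simulate one windy walk and return the list of positions."""
    position, positions = 0.0, [0.0]
    for _ in range(n_steps):
        position += intended_step_size + random.uniform(-wind, wind)
        positions.append(position)
    return positions

small = windy_walk(intended_step_size=1.0)
big = windy_walk(intended_step_size=5.0)   # the intended step now dwarfs the wind
```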