So, today we advance to something slightly more interesting.

OpenAI Gym, a toolkit from OpenAI (the organization co-founded by Elon Musk and Sam Altman, among others), is a platform for developing and testing reinforcement learning algorithms. It aims to provide a consistent interface to a host of different environments, and it is extensible.

The most important thing in this mini-series of two posts is this: we provide an environment that reacts to our action, but with a random element added. The resulting change is the intended action plus a random factor.

This topic is split into two posts:

In the first post we introduce a very simple windy walk model. To keep things as simple as possible, this model is implemented without using OpenAI Gym.
In the second post the same windy walk model is implemented as an extension to OpenAI Gym.

Let’s get started. We want to have an environment, implemented as a class, with two methods: reset (so the walk restarts at zero) and step (the walker takes a step and is shaken by the wind).

In [22]:

import numpy as np

class env():
    
    def __init__(self):
        
        # the walk starts at 0
        self.state = 0
        return None
        
    def reset(self):
        
        # the walk again restarts at 0
        self.state = 0
        return None
    
    def step(self, action):
        
        IntendedStepSize = action
        
        # IntendedStepSize is the step the walker wants to take;
        # on average the walker achieves it (so it is the mean step size)
        
        # ... however, the walker is also pushed around by the wind,
        #   represented by a standard normal draw from np.random.normal()
        windInfluence = np.random.normal()
        
        # the change in the walker's position is the sum of
        # IntendedStepSize and the windInfluence exerted by the environment
        
        self.state += IntendedStepSize + windInfluence
        
        
        return None
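
The update rule in step is simply "new position = old position + intended step + one standard normal draw". As a minimal sketch of the same rule without the class, the helper below (simulate_walk is a hypothetical name, not part of the original post) builds a whole walk at once with a cumulative sum. It uses its own np.random.default_rng generator, so the numbers will differ from those produced by the class with np.random.seed.

```python
import numpy as np

def simulate_walk(intended_step_size, n_steps, seed=None):
    """Hypothetical helper: positions of a windy walk that starts at 0."""
    rng = np.random.default_rng(seed)
    # one wind gust per step, drawn from a standard normal (mean 0, std 1)
    wind = rng.normal(size=n_steps)
    # every step moves by the intended size plus the gust;
    # the cumulative sum turns the per-step moves into positions
    return np.cumsum(intended_step_size + wind)

walk = simulate_walk(0.1, 50, seed=166)
print(len(walk))
```

The class version is still the better fit when actions are chosen one step at a time, which is what the reset/step interface is for; the vectorized form is only convenient when the step size is fixed in advance.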

Next we take a walk…

In [23]:

## do walk with small IntendedStepSize: 0.1

# set random seed, for reproducibility
np.random.seed(166)

test = env()
Walk1 = []

for i in range(50):
    
    test.step(0.1)
    Walk1.append(test.state)

We import the functions needed for plotting and plot the first walk.

In [24]:

# import matplotlib for plotting        
import matplotlib.style as style
import matplotlib.pyplot as plt
style.use('ggplot')


plt.plot(Walk1)

Out[24]:

[<matplotlib.lines.Line2D at 0x7ffb2b0eca20>]

Next we do a walk with a bigger IntendedStepSize.

In [27]:

## do walk with bigger IntendedStepSize: 1

# set random seed, for reproducibility
np.random.seed(167)

test.reset()
Walk2 = []

for i in range(50):
    
    test.step(1)
    Walk2.append(test.state)

In [28]:

# plot both walks together for comparison
# (re-importing matplotlib is harmless if the previous plotting cell has already run)
import matplotlib.style as style
import matplotlib.pyplot as plt
style.use('ggplot')


plt.plot(Walk1, label='Walk1, IntendedStepSize 0.1')
plt.plot(Walk2, label='Walk2, IntendedStepSize 1')
plt.legend(loc='best')
plt.show()
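
Why do the two walks look so different? After n steps the intended steps add up to n times IntendedStepSize, while the sum of n independent standard normal gusts has standard deviation sqrt(n). For n = 50 that is a drift of 5 against a noise level of about 7.07 for Walk1, and a drift of 50 against the same noise level for Walk2. The sketch below checks this by simulating many walks; final_positions, n_walks and the seed are illustrative choices, not taken from the original post.

```python
import numpy as np

n_steps = 50        # same walk length as above
n_walks = 10_000    # assumed number of repetitions, chosen for illustration
rng = np.random.default_rng(0)

def final_positions(intended_step_size):
    """Hypothetical helper: end positions of n_walks independent windy walks."""
    # one standard normal gust per step and per walk
    wind = rng.normal(size=(n_walks, n_steps))
    # the end position is the sum of all (intended step + gust) moves
    return (intended_step_size + wind).sum(axis=1)

for a in (0.1, 1.0):
    finals = final_positions(a)
    # the mean should be close to n_steps * a, the std close to sqrt(n_steps)
    print(a, finals.mean(), finals.std())
```

With the small step size the wind dominates, so a single walk can easily end up below zero; with the bigger step size the drift is clearly visible through the noise.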