So, today we advance to something slightly more interesting.
OpenAI Gym, released by OpenAI (the research organization co-founded by Elon Musk, Sam Altman, and others), is a cool platform for developing and testing reinforcement learning algorithms. It aims to provide a consistent interface to a host of different environments, and it is extensible.
The most important thing in this mini-series of two posts is this: we provide an environment that reacts to our action, but also with a random element. So the resulting reaction is a mixture of the intended action and a random factor.
This topic is split into two posts:
- In the first post we introduce a very simple windy walk model. To keep things as simple as possible, this model is implemented without using OpenAI Gym.
- In the second post this same windy walk model is implemented as an extension to OpenAI Gym.
Let’s get started. We want to have an environment, implemented as a class, with two methods: reset (so the walk restarts at zero) and step (the walker takes a step and is shaken by the wind).
```python
import numpy as np


class env():

    def __init__(self):
        # the walk starts at 0
        self.state = 0

    def reset(self):
        # the walk restarts at 0
        self.state = 0

    def step(self, action):
        # IntendedStepSize is the step the walker wants to take;
        # on average we allow him to achieve that (so it's the average step size)
        IntendedStepSize = action

        # ... however, we subject the walker to a random influence, the influence of the wind,
        # represented by np.random.normal()
        windInfluence = np.random.normal()

        # the change in the walker's position is the sum of
        # his IntendedStepSize and the windInfluence exerted by the environment
        self.state += IntendedStepSize + windInfluence
```
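Since the wind is drawn from a standard normal distribution, the position after n steps should on average be n times the intended step size, with a spread of about the square root of n. The following is only a rough sketch to check this numerically. The function `simulate_final_positions` and its parameters are made up for this illustration, and it uses NumPy's `default_rng` instead of the global random state used in the class above.

```python
import numpy as np


def simulate_final_positions(step_size, n_steps=50, n_walks=10_000, seed=0):
    """Return the final positions of n_walks independent windy walks.

    Hypothetical helper, not part of the env class above.
    """
    rng = np.random.default_rng(seed)
    # one row per walk, one column per step: standard normal wind
    wind = rng.normal(size=(n_walks, n_steps))
    # every step is the intended step plus the wind; summing gives the final position
    return (step_size + wind).sum(axis=1)


n_steps = 50
for step_size in (0.1, 1.0):
    finals = simulate_final_positions(step_size, n_steps=n_steps)
    print(
        f"step {step_size}: "
        f"mean {finals.mean():.2f} (expected {n_steps * step_size:.2f}), "
        f"std {finals.std():.2f} (expected {np.sqrt(n_steps):.2f})"
    )
```

With a step size of 0.1 the expected drift over 50 steps (5) is smaller than the spread (about 7), so a single walk is dominated by the wind. With a step size of 1 the drift (50) clearly dominates.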
Next we take a walk…
```python
## do walk with small IntendedStepSize: 0.1

# set random seed, for reproducibility
np.random.seed(166)

test = env()
Walk1 = []

for i in range(50):
    test.step(0.1)
    Walk1.append(test.state)
```
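Seeding the global NumPy state works, but it also affects every other piece of code that draws random numbers. One possible alternative, sketched here under assumptions, is to give each environment its own generator. The class `SeededWalk` and the helper `walk` are hypothetical names for this illustration and not part of the model above. Note also that `np.random.default_rng(166)` produces a different stream than `np.random.seed(166)`, so the numbers will not match the walk above.

```python
import numpy as np


class SeededWalk:
    """Hypothetical variant of env that owns its random generator."""

    def __init__(self, seed=None):
        # a private generator, independent of the global np.random state
        self.rng = np.random.default_rng(seed)
        self.state = 0.0

    def reset(self):
        # restart the walk at 0 (the generator is NOT re-seeded here)
        self.state = 0.0

    def step(self, action):
        # intended step plus standard normal wind
        self.state += action + self.rng.normal()


def walk(seed, step_size=0.1, n_steps=50):
    """Run one walk and return the list of visited positions."""
    walker = SeededWalk(seed)
    positions = []
    for _ in range(n_steps):
        walker.step(step_size)
        positions.append(walker.state)
    return positions


# the same seed reproduces the same walk
print(walk(166) == walk(166))
```

The design choice here is that `reset` only resets the position, so a second walk on the same object continues the random stream rather than repeating it.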
We import the necessary functions for plotting and plot the first walk.
```python
# import matplotlib for plotting
import matplotlib.style as style
import matplotlib.pyplot as plt

style.use('ggplot')

plt.plot(Walk1)
```
Next we do a walk with a bigger IntendedStepSize:
```python
## do walk with bigger IntendedStepSize: 1

# set random seed, for reproducibility
np.random.seed(167)

test.reset()
Walk2 = []

for i in range(50):
    test.step(1)
    Walk2.append(test.state)
```
```python
# import matplotlib for plotting
import matplotlib.style as style
import matplotlib.pyplot as plt

style.use('ggplot')

plt.plot(Walk1, label='Walk1, IntendedStepSize 0.1')
plt.plot(Walk2, label='Walk2, IntendedStepSize 1')
plt.legend(loc='best')
plt.show()
```