Game of Nim, Reinforcement Learning

Intro

In this post we report success in using reinforcement learning to learn the game of Nim. We had previously cited two theses (Erik Järleberg (2011) and Paul Graham & William Lord (2015)) that used Q-learning to learn the game. In this setting, however, Q-learning scales much worse than value learning, because a Q-table needs an entry for every state-action pair, whereas a value table needs only one entry per state. In this post we use a value-based approach with a table. Because the value-based approach is much more efficient than Q-learning, no function approximation is needed up to reasonable heap sizes.
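To make the scaling argument concrete, here is a minimal sketch that counts table entries for both approaches. It assumes standard Nim rules (a move removes one or more objects from a single heap) and treats heaps as ordered; the function name `table_sizes` and its parameters are illustrative and not taken from the cited theses or from the implementation in this post.

```python
from itertools import product


def table_sizes(num_heaps, max_heap):
    """Count entries in a state-value table and in a Q-table for Nim.

    Assumption: a state is an ordered tuple of heap sizes, each between
    0 and max_heap, and a move removes 1..h objects from one heap of size h.
    """
    n_states = 0
    n_state_actions = 0
    for heaps in product(range(max_heap + 1), repeat=num_heaps):
        n_states += 1
        # Each heap of size h allows h distinct moves, so the position
        # has sum(heaps) legal moves in total.
        n_state_actions += sum(heaps)
    return n_states, n_state_actions


if __name__ == "__main__":
    for max_heap in (3, 7, 15):
        v_entries, q_entries = table_sizes(3, max_heap)
        print(f"3 heaps, up to {max_heap}: V entries={v_entries}, Q entries={q_entries}")
```

Under these assumptions the Q-table is larger than the value table by a factor of `num_heaps * max_heap / 2`, so the gap widens as the heaps grow. A different state encoding (for example, treating heaps as unordered) would change the absolute counts but not the direction of the comparison.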
