In this short post we compare several implementations of a very simple regression problem in TensorFlow and Keras.
We start off with an eye-catching plot, representing the functioning of an optimizer using the stochastic gradient method. The plot is explained in more detail further below.
The focus is on the first principles of gradient descent. We replicate the results of [1, 2]. The post uses a gradient tape, which in turn relies on automatic differentiation [3, 4].
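To make the first-principles idea concrete, here is a minimal sketch of gradient descent for a one-variable linear regression in plain Python. It is not the code from the notebooks: the toy data, the function names `mse_gradients` and `fit`, and the hyperparameters are assumptions chosen for illustration. It uses full-batch rather than stochastic updates, and the gradients of the mean squared error are derived by hand. A gradient tape would compute them automatically instead.

```python
# Toy data (hypothetical): noise-free samples of y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def mse_gradients(w, b, xs, ys):
    """Hand-derived gradients of the mean squared error of y_hat = w*x + b."""
    n = len(xs)
    dw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def fit(xs, ys, learning_rate=0.05, steps=2000):
    """Full-batch gradient descent starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradients(w, b, xs, ys)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


w, b = fit(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

With this noise-free data the fitted parameters approach the slope and intercept used to generate it. A learning rate that is too large would make the updates diverge, so the value above is tied to this particular toy data set.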
In the original implementation in [1], the training and testing data are not separate. The motivation behind the original version is – doubtless – to keep things as simple as possible and to omit everything unimportant. We feel, however, that the lack of a train/test split might be confusing. Therefore we use a train/test split in the notebooks covered in this post.
We first present a “split-variation” of the original version, in which the training and testing data are in fact split.
We add two more notebooks that replicate the split-variation:
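As a rough illustration of what such a split involves, here is a minimal sketch using only the Python standard library. The helper name `train_test_split`, the 80/20 ratio, and the fixed seed are assumptions for this example and may differ from what the notebooks do.

```python
import random


def train_test_split(xs, ys, test_fraction=0.2, seed=0):
    """Shuffle the sample indices and hold out a fraction of them for testing."""
    indices = list(range(len(xs)))
    # A fixed seed makes the shuffle, and therefore the split, reproducible.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(xs) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [xs[i] for i in train_idx]
    y_train = [ys[i] for i in train_idx]
    x_test = [xs[i] for i in test_idx]
    y_test = [ys[i] for i in test_idx]
    return (x_train, y_train), (x_test, y_test)


# Hypothetical data: ten samples of y = 3x.
xs = [float(i) for i in range(10)]
ys = [3.0 * x for x in xs]
(x_train, y_train), (x_test, y_test) = train_test_split(xs, ys)
print(len(x_train), len(x_test))
```

The model is then fitted on the training pairs only, and the held-out pairs are used solely to check how well it generalizes.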
- A TensorFlow-based replication with a standard optimizer.
- A TensorFlow/Keras implementation.
Please note that all three notebooks are self-contained. Moreover, the results are exactly the same across the notebooks.
As usual, the code and notebooks can be found on GitHub: