But I digress…
GANs, or Generative Adversarial Networks, represent the most exciting technology I have encountered since way back when I got my news from Usenet.
Generative Adversarial Networks
GANs consist of 2 neural networks:
a generator (G)
a discriminator (D)
The classic analogy is:
The Forger (G – the generator)
The Detective (D – the discriminator)
The goal of the forger is to generate “paintings” that are good enough to fool the detective. The detective’s goal is to tell the real paintings from the fake ones.
GAN Generated “birds”
We’ll use images as an example throughout this post, but keep in mind that GANs can generate speech, words, video, music, etc.
The Generator (G)
The generator starts from latent data, which can be little more than random noise. G slowly generates better and better images in an attempt to fool the discriminator.
Technically, G is learning to map latent samples (random vectors) to images, tuning its parameters to generate images that are indistinguishable (from the point of view of D) from the real data.
The generator is, in effect, a convolutional network run in reverse. Instead of taking an image and downsampling it to produce a probability, G takes a vector of random noise and upsamples it to create an image.
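To make that concrete, here is a minimal sketch of such a generator in PyTorch (an assumed architecture for illustration, not the exact code from my repo): it upsamples a 100-dimensional noise vector into a 28x28 image using transposed convolutions.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent noise vector up to a 28x28 single-channel image."""

    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 128, kernel_size=7),   # 1x1 -> 7x7
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 7x7 -> 14x14
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),    # 14x14 -> 28x28
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        # z: (batch, latent_dim) -> (batch, latent_dim, 1, 1) feature map
        return self.net(z.view(z.size(0), -1, 1, 1))

G = Generator()
z = torch.randn(16, 100)   # a batch of random latent vectors
fake_images = G(z)
print(fake_images.shape)   # torch.Size([16, 1, 28, 28])
```

Note the shape comments: each transposed convolution doubles (or sets) the spatial resolution, which is exactly the "upsampling" described above.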
The Discriminator (D)
The discriminator is the detective from the analogy. D “discriminates”, judging the “paintings” and trying to spot the “forgeries”. D has seen LOTS of real images (through the training data), so when it is shown a test painting to judge, it predicts:
0 = Fake
1 = Real
That is, it’s fake if D thinks it was generated, and it’s real if D thinks it came from the training data. The discriminator network is basically a standard binary classifier that categorizes the images it’s fed as True (1) or False (0).
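A matching discriminator sketch (again, an assumed architecture for illustration): a standard binary classifier that downsamples a 28x28 image to a single probability that the image is real.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Downsamples a 28x28 image to one probability: 1 = real, 0 = fake."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1),    # 28x28 -> 14x14
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 14x14 -> 7x7
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(128 * 7 * 7, 1),
            nn.Sigmoid(),  # squash to (0, 1)
        )

    def forward(self, x):
        return self.net(x)

D = Discriminator()
images = torch.randn(16, 1, 28, 28)  # stand-in for a batch of paintings
scores = D(images)
print(scores.shape)  # torch.Size([16, 1]), each score in (0, 1)
```

It is literally the generator's shape comments run backwards: convolutions shrink the image while the generator's transposed convolutions grow it.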
The Adversarial in Generative Adversarial Networks
The 2 networks (D and G) are each trying to optimize a different, opposing loss function in a zero-sum game. You may have heard this called minimax in game theory.
This brings up an inherent problem. Each side of the GAN can potentially overpower the other.
If the discriminator is too good, it will return values so close to 0 or 1 that the generator will struggle to read the gradient. If the generator is too good, it will persistently exploit weaknesses in the discriminator that lead to false negatives.
So the idea is to create a system where neither side wins outright: ideally, D ends up outputting values near 0.5 consistently, unable to tell real from fake.
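The adversarial loop itself can be sketched with a toy setup (tiny MLPs and 1-D "real" data drawn from a Gaussian, all assumptions for brevity): each step trains D to separate real from fake, then trains G to fool D.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy networks: G maps 8-D noise to a 1-D sample; D scores a sample as real/fake.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 1) + 4.0   # "real" data: samples from N(4, 1)
    fake = G(torch.randn(64, 8))      # generator's forgeries

    # Train D: push scores for real toward 1 and for fake toward 0.
    opt_d.zero_grad()
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    d_loss.backward()
    opt_d.step()

    # Train G: the opposing objective -- make D label fakes as real (1).
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

sample_mean = G(torch.randn(256, 8)).mean().item()
print(sample_mean)  # drifts toward the real data's mean as G improves
```

The `detach()` when training D is the important detail: it stops D's loss from updating G, keeping the two optimization steps truly adversarial rather than cooperative.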
You can see my Python code for a GAN that works with the MNIST dataset.