A Generative Adversarial Networks tutorial applied to Image Deblurring with the Keras library.
In 2014, Ian Goodfellow introduced the Generative Adversarial Network (GAN). This article focuses on applying GANs to image deblurring with Keras.
Have a look at the original scientific publication and its PyTorch version. All the Keras code for this article is available here.
Quick Reminder on Generative Adversarial Networks
In Generative Adversarial Networks, two networks train against each other. The generator misleads the discriminator by creating compelling fake inputs. The discriminator tells if an input is real or artificial.
There are 3 major steps in the training:
- use the generator to create fake inputs based on noise
- train the discriminator with both real and fake inputs
- train the whole model: the model is built with the discriminator chained to the generator.
Note that the discriminator's weights are frozen during the third step.
The reason for chaining both networks is that there is no direct feedback on the generator's outputs: our only measure is whether the discriminator accepts the generated samples.
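The chaining and freezing described above can be sketched in Keras as follows. The toy Dense models are illustrative stand-ins for the real convolutional networks; only the pattern (compile the discriminator, then freeze it before wiring the combined model) reflects the actual training setup:

```python
# Minimal sketch of chaining the networks with the discriminator frozen.
# The toy Dense models stand in for the real convolutional networks.
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

g_in = Input(shape=(16,))
generator = Model(g_in, Dense(16)(g_in), name="generator")

d_in = Input(shape=(16,))
discriminator = Model(d_in, Dense(1, activation="sigmoid")(d_in), name="discriminator")
# Compiled while trainable: the discriminator still learns in its own step
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Freeze the discriminator before wiring the combined model: only the
# generator's weights are updated when the whole model trains.
discriminator.trainable = False
gan = Model(g_in, discriminator(generator(g_in)), name="gan")
gan.compile(optimizer="adam", loss="binary_crossentropy")
```

Because the discriminator was compiled before being frozen, it still trains in its own step, while the combined model only updates the generator.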
This is a brief reminder of GAN’s architecture. If you don’t feel at ease, you can refer to this excellent introduction.
The Data
Ian Goodfellow first applied GAN models to generate MNIST data. In this tutorial, we use generative adversarial networks for image deblurring. Therefore, the generator’s input isn’t noise but blurred images.
The dataset is the GOPRO dataset. You can download a light version (9GB) or the complete version (35GB). It contains artificially blurred images from multiple street views. The dataset is decomposed into subfolders by scene.
We first distribute the images into two folders, A (blurred) and B (sharp). This A&B architecture corresponds to the original pix2pix article. I created a custom script in the repo to perform this task; follow the README to use it!
The Model
The training process stays the same. First, let’s take a look at the neural network architectures!
The Generator
The generator aims at reproducing sharp images. The network is based on ResNet blocks, which keep track of the evolutions applied to the original blurred image. The publication also used a UNet-based version, which I haven't implemented. Both architectures should perform well for image deblurring.
The core is 9 ResNet blocks applied to an upsampling of the original image. Let’s see the Keras implementation!
This ResNet layer is basically a convolutional layer, with input and output added to form the final output.
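A sketch of such a block is below. The original implementation uses reflection padding and optional dropout; `padding="same"` is substituted here to keep the example self-contained:

```python
from tensorflow.keras.layers import (Activation, Add, BatchNormalization,
                                     Conv2D, Input)
from tensorflow.keras.models import Model

def res_block(x, filters, kernel_size=(3, 3)):
    # Two convolutions; the block's input is added back to its output.
    # The original uses reflection padding; "same" padding keeps this
    # sketch self-contained.
    y = Conv2D(filters, kernel_size, padding="same")(x)
    y = BatchNormalization()(y)
    y = Activation("relu")(y)
    y = Conv2D(filters, kernel_size, padding="same")(y)
    y = BatchNormalization()(y)
    return Add()([x, y])

# The block preserves both spatial dimensions and channel count
inputs = Input(shape=(64, 64, 256))
block = Model(inputs, res_block(inputs, filters=256))
```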
As planned, the 9 ResNet blocks are applied to an upsampled version of the input. We add a connection from the input to the output and divide by 2 to keep normalized outputs.
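The full generator might be sketched as follows. This follows the common DeblurGAN layout (strided convolutions before the ResNet blocks, transposed convolutions after); filter counts and the tanh output are illustrative, and the `res_block` helper restates the ResNet block described above:

```python
from tensorflow.keras.layers import (Activation, Add, BatchNormalization,
                                     Conv2D, Conv2DTranspose, Input, Lambda)
from tensorflow.keras.models import Model

def res_block(x, filters, kernel_size=(3, 3)):
    # ResNet block: two convolutions plus a skip connection
    y = Conv2D(filters, kernel_size, padding="same")(x)
    y = BatchNormalization()(y)
    y = Activation("relu")(y)
    y = Conv2D(filters, kernel_size, padding="same")(y)
    y = BatchNormalization()(y)
    return Add()([x, y])

def build_generator(image_shape=(256, 256, 3), n_blocks=9):
    inputs = Input(shape=image_shape)
    # Initial convolutions, then the 9 ResNet blocks at the core
    x = Conv2D(64, (7, 7), padding="same", activation="relu")(inputs)
    x = Conv2D(128, (3, 3), strides=2, padding="same", activation="relu")(x)
    x = Conv2D(256, (3, 3), strides=2, padding="same", activation="relu")(x)
    for _ in range(n_blocks):
        x = res_block(x, filters=256)
    x = Conv2DTranspose(128, (3, 3), strides=2, padding="same", activation="relu")(x)
    x = Conv2DTranspose(64, (3, 3), strides=2, padding="same", activation="relu")(x)
    x = Conv2D(image_shape[2], (7, 7), padding="same", activation="tanh")(x)
    # Connection from input to output, divided by 2 to keep outputs normalized
    x = Add()([inputs, x])
    outputs = Lambda(lambda z: z / 2)(x)
    return Model(inputs, outputs)
```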
That’s it for the generator! Let’s take a look at the discriminator’s architecture.
The Discriminator
The objective is to determine if an input image is artificially created. Therefore, the discriminator’s architecture is convolutional and outputs a single value.
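A convolutional discriminator along those lines might look like this; the filter counts, strides, and dense head are illustrative assumptions based on the common DCGAN/pix2pix pattern, not the repo's exact values:

```python
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input, LeakyReLU
from tensorflow.keras.models import Model

def build_discriminator(image_shape=(256, 256, 3)):
    inputs = Input(shape=image_shape)
    x = inputs
    # Strided convolutions progressively downsample the image
    for filters in (64, 128, 256, 512):
        x = Conv2D(filters, (4, 4), strides=2, padding="same")(x)
        x = LeakyReLU(0.2)(x)
    x = Flatten()(x)
    x = Dense(1024, activation="tanh")(x)
    # Single value: probability that the input image is real
    outputs = Dense(1, activation="sigmoid")(x)
    return Model(inputs, outputs)
```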
The last step is to build the full model. A particularity of this GAN is that inputs are real images and not noise. Therefore, we have direct feedback on the generator's outputs.
Let’s see how we make the most of this particularity by using two losses.
The Training
Losses
We extract losses at two levels, at the end of the generator and at the end of the full model.
The first one is a perceptual loss computed directly on the generator’s outputs. This first loss ensures the GAN model is oriented towards a deblurring task. It compares the outputs of the first convolutions of VGG.
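This perceptual loss can be sketched as below. The real loss uses ImageNet-pretrained VGG16 weights (`weights="imagenet"`); `weights=None` is used here only to keep the example download-free, and the choice of `block3_conv3` as the comparison layer is an assumption about which early VGG layer is used:

```python
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model

image_shape = (256, 256, 3)

# Feature extractor stopping at an early VGG16 convolution.
# In practice, use weights="imagenet"; None avoids the download here.
vgg = VGG16(include_top=False, weights=None, input_shape=image_shape)
loss_model = Model(vgg.input, vgg.get_layer("block3_conv3").output)
loss_model.trainable = False

def perceptual_loss(y_true, y_pred):
    # Mean squared error between VGG feature maps of the two images
    return tf.reduce_mean(tf.square(loss_model(y_true) - loss_model(y_pred)))
```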
The second loss is the Wasserstein loss, computed on the output of the whole model. With labels of +1 (real) and -1 (fake), it reduces to the mean of the element-wise product of labels and predictions. It is known to improve convergence of generative adversarial networks.
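The Wasserstein loss itself is a one-liner, shown here as a sketch:

```python
import tensorflow as tf

def wasserstein_loss(y_true, y_pred):
    # With labels of +1 (real) and -1 (fake), this is the (signed)
    # mean of the critic's scores for the batch
    return tf.reduce_mean(y_true * y_pred)
```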
Training Routine
The first step is to load the data and initialize all the models. We use our custom function to load the dataset, and add Adam optimizers for our models. We set the Keras trainable option to False on the discriminator to prevent it from training while the whole model is trained.
Then, we start launching the epochs and divide the dataset into batches.
Finally, we successively train the discriminator and the generator, based on both losses. We generate fake inputs with the generator. We train the discriminator to distinguish fake from real inputs, and we train the whole model.
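One iteration of that routine might be sketched as follows. `generator`, `discriminator`, and `gan` are assumed to be already built and compiled (the `gan` model with the perceptual and Wasserstein losses on its two outputs); the +1/-1 labels follow the Wasserstein setup, and the number of critic updates is illustrative:

```python
import numpy as np

def train_step(generator, discriminator, gan, x_blur, y_sharp, critic_updates=5):
    # One training iteration: `gan` is the combined model with the
    # discriminator frozen; all model names here are assumptions.
    batch_size = x_blur.shape[0]
    real_labels = np.ones((batch_size, 1))
    fake_labels = -np.ones((batch_size, 1))

    # Generate fake (deblurred) images from the blurred inputs
    generated = generator.predict(x_blur, verbose=0)

    # 1. Train the discriminator on real and generated images
    for _ in range(critic_updates):
        d_loss_real = discriminator.train_on_batch(y_sharp, real_labels)
        d_loss_fake = discriminator.train_on_batch(generated, fake_labels)

    # 2. Train the whole model: the generator learns through both the
    #    perceptual loss (against the sharp image) and the Wasserstein
    #    loss (against the frozen discriminator's output)
    g_loss = gan.train_on_batch(x_blur, [y_sharp, real_labels])
    return d_loss_real, d_loss_fake, g_loss
```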
You can refer to the Github repo to see the full loop!
Material
I used an AWS Instance (p2.xlarge) with the Deep Learning AMI (version 3.0). Training time was around 5 hours (for 50 epochs) on the light GOPRO dataset.
Image Deblurring Results
The output above is the result of our Keras Deblur GAN. Even on heavy blur, the network is able to reduce it and produce a more convincing image. Car lights are sharper, tree branches are clearer.
A limitation is the induced pattern on top of the image, which might be caused by the use of VGG as a loss.
If you are interested in computer vision, you can read this.
Below is the list of resources for Generative Adversarial Networks.
Resources for Generative Adversarial Networks
- NIPS 2016: Generative Adversarial Networks by Ian Goodfellow
- ICCV 2017: Tutorials on GAN
- GAN Implementations with Keras by Eric Linder-Noren
- A List of Generative Adversarial Networks Resources by deeplearning4j
- Really-awesome-gan by Holger Caesar
I hope you enjoyed this article on Generative Adversarial Networks for Image Deblurring!
Thanks to Antoine Toubhans, Alexandre Sapet, and Martin Müller.
If you are looking for computer vision experts, don't hesitate to contact us!