A Generative Adversarial Networks tutorial applied to Image Deblurring with the Keras library.
In 2014, Ian Goodfellow introduced the Generative Adversarial Network (GAN). This article focuses on applying GANs to image deblurring with Keras.
Have a look at the original scientific publication and its PyTorch version. All the Keras code for this article is available here.
Quick Reminder on Generative Adversarial Networks
In Generative Adversarial Networks, two networks train against each other. The generator misleads the discriminator by creating compelling fake inputs. The discriminator tells if an input is real or artificial.
There are 3 major steps in the training:
- use the generator to create fake inputs based on noise
- train the discriminator with both real and fake inputs
- train the whole model: the model is built with the discriminator chained to the generator.
Note that the discriminator's weights are frozen during the third step.
The reason for chaining both networks is that there is no direct feedback on the generator's outputs: our only measure is whether the discriminator accepts the generated samples.
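The chaining and freezing described above can be sketched in Keras as follows. The toy Dense models are illustrative stand-ins for the real convolutional networks; only the pattern (compile the discriminator, then freeze it before wiring the combined model) reflects the actual training setup:

```python
# Minimal sketch of chaining the networks with the discriminator frozen.
# The toy Dense models stand in for the real convolutional networks.
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

g_in = Input(shape=(16,))
generator = Model(g_in, Dense(16)(g_in), name="generator")

d_in = Input(shape=(16,))
discriminator = Model(d_in, Dense(1, activation="sigmoid")(d_in), name="discriminator")
# Compiled while trainable: the discriminator still learns in its own step
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Freeze the discriminator before wiring the combined model: only the
# generator's weights are updated when the whole model trains.
discriminator.trainable = False
gan = Model(g_in, discriminator(generator(g_in)), name="gan")
gan.compile(optimizer="adam", loss="binary_crossentropy")
```

Because the discriminator was compiled before being frozen, it still trains in its own step, while the combined model only updates the generator.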
This is a brief reminder of GAN’s architecture. If you don’t feel at ease, you can refer to this excellent introduction.
The Data
Ian Goodfellow first applied GAN models to generate MNIST data. In this tutorial, we use generative adversarial networks for image deblurring. Therefore, the generator’s input isn’t noise but blurred images.
The dataset is the GOPRO dataset. You can download a light version (9GB) or the complete version (35GB). It contains artificially blurred images from multiple street views. The dataset is decomposed into subfolders by scene.
We first distribute the images into two folders, A (blurred) and B (sharp). This A&B architecture corresponds to the original pix2pix article. I created a custom script in the repo to perform this task; follow the README to use it!
The Model
The training process stays the same. First, let’s take a look at the neural network architectures!
The Generator
The generator aims at reproducing sharp images. The network is based on ResNet blocks, which keep track of the evolutions applied to the original blurred image. The publication also used a UNet-based version, which I haven't implemented. Both architectures should perform well for image deblurring.
The core is 9 ResNet blocks applied to an upsampling of the original image. Let’s see the Keras implementation!
This ResNet layer is basically a convolutional layer, with input and output added to form the final output.
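A sketch of such a block is below. The original implementation uses reflection padding and optional dropout; `padding="same"` is substituted here to keep the example self-contained:

```python
from tensorflow.keras.layers import (Activation, Add, BatchNormalization,
                                     Conv2D, Input)
from tensorflow.keras.models import Model

def res_block(x, filters, kernel_size=(3, 3)):
    # Two convolutions; the block's input is added back to its output.
    # The original uses reflection padding; "same" padding keeps this
    # sketch self-contained.
    y = Conv2D(filters, kernel_size, padding="same")(x)
    y = BatchNormalization()(y)
    y = Activation("relu")(y)
    y = Conv2D(filters, kernel_size, padding="same")(y)
    y = BatchNormalization()(y)
    return Add()([x, y])

# The block preserves both spatial dimensions and channel count
inputs = Input(shape=(64, 64, 256))
block = Model(inputs, res_block(inputs, filters=256))
```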
As planned, the 9 ResNet blocks are applied to an upsampled version of the input. We add a connection from the input to the output and divide by 2 to keep normalized outputs.
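The full generator might be sketched as follows. This follows the common DeblurGAN layout (strided convolutions before the ResNet blocks, transposed convolutions after); filter counts and the tanh output are illustrative, and the `res_block` helper restates the ResNet block described above:

```python
from tensorflow.keras.layers import (Activation, Add, BatchNormalization,
                                     Conv2D, Conv2DTranspose, Input, Lambda)
from tensorflow.keras.models import Model

def res_block(x, filters, kernel_size=(3, 3)):
    # ResNet block: two convolutions plus a skip connection
    y = Conv2D(filters, kernel_size, padding="same")(x)
    y = BatchNormalization()(y)
    y = Activation("relu")(y)
    y = Conv2D(filters, kernel_size, padding="same")(y)
    y = BatchNormalization()(y)
    return Add()([x, y])

def build_generator(image_shape=(256, 256, 3), n_blocks=9):
    inputs = Input(shape=image_shape)
    # Initial convolutions, then the 9 ResNet blocks at the core
    x = Conv2D(64, (7, 7), padding="same", activation="relu")(inputs)
    x = Conv2D(128, (3, 3), strides=2, padding="same", activation="relu")(x)
    x = Conv2D(256, (3, 3), strides=2, padding="same", activation="relu")(x)
    for _ in range(n_blocks):
        x = res_block(x, filters=256)
    x = Conv2DTranspose(128, (3, 3), strides=2, padding="same", activation="relu")(x)
    x = Conv2DTranspose(64, (3, 3), strides=2, padding="same", activation="relu")(x)
    x = Conv2D(image_shape[2], (7, 7), padding="same", activation="tanh")(x)
    # Connection from input to output, divided by 2 to keep outputs normalized
    x = Add()([inputs, x])
    outputs = Lambda(lambda z: z / 2)(x)
    return Model(inputs, outputs)
```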
That’s it for the generator! Let’s take a look at the discriminator’s architecture.
The Discriminator
The objective is to determine if an input image is artificially created. Therefore, the discriminator’s architecture is convolutional and outputs a single value.
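A convolutional discriminator along those lines might look like this; the filter counts, strides, and dense head are illustrative assumptions based on the common DCGAN/pix2pix pattern, not the repo's exact values:

```python
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input, LeakyReLU
from tensorflow.keras.models import Model

def build_discriminator(image_shape=(256, 256, 3)):
    inputs = Input(shape=image_shape)
    x = inputs
    # Strided convolutions progressively downsample the image
    for filters in (64, 128, 256, 512):
        x = Conv2D(filters, (4, 4), strides=2, padding="same")(x)
        x = LeakyReLU(0.2)(x)
    x = Flatten()(x)
    x = Dense(1024, activation="tanh")(x)
    # Single value: probability that the input image is real
    outputs = Dense(1, activation="sigmoid")(x)
    return Model(inputs, outputs)
```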
The last step is to build the full model. A particularity of this GAN is that inputs are real images and not noise. Therefore, we have direct feedback on the generator's outputs.
Let’s see how we make the most of this particularity by using two losses.
The Training
Losses
We extract losses at two levels, at the end of the generator and at the end of the full model.
The first one is a perceptual loss computed directly on the generator’s outputs. This first loss ensures the GAN model is oriented towards a deblurring task. It compares the outputs of the first convolutions of VGG.
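This perceptual loss can be sketched as below. The real loss uses ImageNet-pretrained VGG16 weights (`weights="imagenet"`); `weights=None` is used here only to keep the example download-free, and the choice of `block3_conv3` as the comparison layer is an assumption about which early VGG layer is used:

```python
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model

image_shape = (256, 256, 3)

# Feature extractor stopping at an early VGG16 convolution.
# In practice, use weights="imagenet"; None avoids the download here.
vgg = VGG16(include_top=False, weights=None, input_shape=image_shape)
loss_model = Model(vgg.input, vgg.get_layer("block3_conv3").output)
loss_model.trainable = False

def perceptual_loss(y_true, y_pred):
    # Mean squared error between VGG feature maps of the two images
    return tf.reduce_mean(tf.square(loss_model(y_true) - loss_model(y_pred)))
```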
The second loss is the Wasserstein loss, computed on the output of the whole model. With labels of +1 (real) and -1 (fake), it reduces to the mean of the element-wise product of labels and predictions. It is known to improve convergence of generative adversarial networks.
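The Wasserstein loss itself is a one-liner, shown here as a sketch:

```python
import tensorflow as tf

def wasserstein_loss(y_true, y_pred):
    # With labels of +1 (real) and -1 (fake), this is the (signed)
    # mean of the critic's scores for the batch
    return tf.reduce_mean(y_true * y_pred)
```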
Training Routine
The first step is to load the data and initialize all the models. We use our custom function to load the dataset, and add Adam optimizers for our models. We set the Keras trainable option to False on the discriminator to prevent it from training while the whole model is trained.
Then, we start launching the epochs and divide the dataset into batches.
Finally, we successively train the discriminator and the generator, based on both losses. We generate fake inputs with the generator. We train the discriminator to distinguish fake from real inputs, and we train the whole model.
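One iteration of that routine might be sketched as follows. `generator`, `discriminator`, and `gan` are assumed to be already built and compiled (the `gan` model with the perceptual and Wasserstein losses on its two outputs); the +1/-1 labels follow the Wasserstein setup, and the number of critic updates is illustrative:

```python
import numpy as np

def train_step(generator, discriminator, gan, x_blur, y_sharp, critic_updates=5):
    # One training iteration: `gan` is the combined model with the
    # discriminator frozen; all model names here are assumptions.
    batch_size = x_blur.shape[0]
    real_labels = np.ones((batch_size, 1))
    fake_labels = -np.ones((batch_size, 1))

    # Generate fake (deblurred) images from the blurred inputs
    generated = generator.predict(x_blur, verbose=0)

    # 1. Train the discriminator on real and generated images
    for _ in range(critic_updates):
        d_loss_real = discriminator.train_on_batch(y_sharp, real_labels)
        d_loss_fake = discriminator.train_on_batch(generated, fake_labels)

    # 2. Train the whole model: the generator learns through both the
    #    perceptual loss (against the sharp image) and the Wasserstein
    #    loss (against the frozen discriminator's output)
    g_loss = gan.train_on_batch(x_blur, [y_sharp, real_labels])
    return d_loss_real, d_loss_fake, g_loss
```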
You can refer to the Github repo to see the full loop!
Material
I used an AWS Instance (p2.xlarge) with the Deep Learning AMI (version 3.0). Training time was around 5 hours (for 50 epochs) on the light GOPRO dataset.
Image Deblurring Results
The output above is the result of our Keras Deblur GAN. Even on heavy blur, the network is able to reduce it and produce a more convincing image. Car lights are sharper, tree branches are clearer.
A limitation is the induced pattern on top of the image, which might be caused by the use of VGG as a loss.
If you are interested in computer vision, you can read this.
Below is the list of resources for Generative Adversarial Networks.
Resources for Generative Adversarial Networks
- NIPS 2016: Generative Adversarial Networks by Ian Goodfellow
- ICCV 2017: Tutorials on GAN
- GAN Implementations with Keras by Eric Linder-Noren
- A List of Generative Adversarial Networks Resources by deeplearning4j
- Really-awesome-gan by Holger Caesar
I hope you enjoyed this article on Generative Adversarial Networks for Image Deblurring!
Thanks to Antoine Toubhans, Alexandre Sapet, and Martin Müller.
If you are looking for computer vision experts, don't hesitate to contact us!