In this tutorial, you will learn how to use autoencoders to denoise images using Keras, TensorFlow, and Deep Learning.
Today’s tutorial is part two in our three-part series on the applications of autoencoders:
- Autoencoders with Keras, TensorFlow, and Deep Learning (last week’s tutorial)
- Denoising autoencoders with Keras, TensorFlow, and Deep Learning (today’s tutorial)
- Anomaly detection with Keras, TensorFlow, and Deep Learning (next week’s tutorial)
Last week you learned the fundamentals of autoencoders, including how to train your very first autoencoder using Keras and TensorFlow; however, the real-world application of that tutorial was admittedly a bit limited because we first needed to lay the groundwork.
Today, we’re going to take a deeper dive and learn how autoencoders can be used for denoising, also called “noise reduction,” which is the process of removing noise from a signal.
The term “noise” here could be:
- Produced by a faulty or poor quality image sensor
- Random variations in brightness or color
- Quantization noise
- Artifacts due to JPEG compression
- Image perturbations produced by an image scanner or threshold post-processing
- Poor paper quality (crinkles and folds) when trying to perform OCR
From the perspective of image processing and computer vision, you should think of noise as anything that could be removed by a really good pre-processing filter.
Our goal is to train an autoencoder to perform such pre-processing — we call such models denoising autoencoders.
To learn how to train a denoising autoencoder with Keras and TensorFlow, just keep reading!
Denoising autoencoders with Keras, TensorFlow, and Deep Learning
In the first part of this tutorial, we’ll discuss what denoising autoencoders are and why we may want to use them.
From there I’ll show you how to implement and train a denoising autoencoder using Keras and TensorFlow.
We’ll wrap up this tutorial by examining the results of our denoising autoencoder.
What are denoising autoencoders, and why would we use them?
Denoising autoencoders are an extension of simple autoencoders; however, it’s worth noting that denoising autoencoders were not originally meant to automatically denoise an image.
Instead, the denoising autoencoder procedure was invented to help:
- The hidden layers of the autoencoder learn more robust filters
- Reduce the risk of overfitting in the autoencoder
- Prevent the autoencoder from learning a simple identity function
In Vincent et al.’s 2008 ICML paper, Extracting and Composing Robust Features with Denoising Autoencoders, the authors found that they could improve the robustness of their internal layers (i.e., latent-space representation) by purposely introducing noise to their signal.
Noise was stochastically (i.e., randomly) added to the input data, and then the autoencoder was trained to recover the original, nonperturbed signal.
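To make that procedure concrete, here is a minimal, self-contained sketch of the idea (illustrative only; it is not code from the paper, and the tiny fully-connected model and synthetic data are stand-ins for what we’ll build later in this tutorial):

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# hypothetical clean training data: 1,000 flattened samples in [0, 1]
xClean = np.random.rand(1000, 784).astype("float32")

# stochastically corrupt the inputs with Gaussian noise, then clip the
# result back into the valid [0, 1] range
xNoisy = np.clip(xClean + np.random.normal(0, 0.25, size=xClean.shape),
	0, 1).astype("float32")

# a deliberately tiny fully-connected autoencoder, just for illustration
inputs = Input(shape=(784,))
latent = Dense(16, activation="relu")(inputs)
outputs = Dense(784, activation="sigmoid")(latent)
model = Model(inputs, outputs)
model.compile(loss="mse", optimizer="adam")

# the crux of denoising autoencoders: the *noisy* data is the input,
# but the *clean* data is the reconstruction target
model.fit(xNoisy, xClean, epochs=5, batch_size=32)
```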
From an image processing standpoint, we can train an autoencoder to perform automatic image pre-processing for us.
A great example would be pre-processing an image to improve the accuracy of an optical character recognition (OCR) algorithm. If you’ve ever applied OCR before, you know how just a little bit of the wrong type of noise (ex., printer ink smudges, poor image quality during the scan, etc.) can dramatically hurt the performance of your OCR method. Using denoising autoencoders, we can automatically pre-process the image, improve the quality, and therefore increase the accuracy of the downstream OCR algorithm.
If you’re interested in learning more about denoising autoencoders, I would strongly encourage you to read this article as well as Bengio and Delalleau’s paper, Justifying and Generalizing Contrastive Divergence.
For more information on denoising autoencoders for OCR-related preprocessing, take a look at this dataset on Kaggle.
Configuring your development environment
To follow along with today’s tutorial on autoencoders, you should use TensorFlow 2.0. I have two installation tutorials for TF 2.0 and associated packages to bring your development system up to speed:
- How to install TensorFlow 2.0 on Ubuntu (Ubuntu 18.04 OS; CPU and optional NVIDIA GPU)
- How to install TensorFlow 2.0 on macOS (Catalina and Mojave OSes)
Please note: PyImageSearch does not support Windows — refer to our FAQ.
Project structure
Go ahead and grab the .zip from the “Downloads” section of today’s tutorial. From there, extract the zip.
You’ll be presented with the following project layout:
```
$ tree --dirsfirst
.
├── pyimagesearch
│   ├── __init__.py
│   └── convautoencoder.py
├── output.png
├── plot.png
└── train_denoising_autoencoder.py
1 directory, 5 files
```
The `pyimagesearch` module contains the `ConvAutoencoder` class. We reviewed this class in our previous tutorial; however, we’ll briefly walk through it again today.
The heart of today’s tutorial is inside the `train_denoising_autoencoder.py` Python training script. This script is different from the previous tutorial in one main way:
We will purposely add noise to our MNIST training images using a random normal distribution centered at `0.5` with a standard deviation of `0.5`.
The purpose of adding noise to our training data is so that our autoencoder can effectively remove noise from an input image (i.e., denoise).
Implementing our denoising autoencoder with Keras and TensorFlow
The denoising autoencoder we’ll be implementing today is essentially identical to the one we implemented in last week’s tutorial on autoencoder fundamentals.
We’ll review the model architecture here today as a matter of completeness, but make sure you refer to last week’s guide for more details.
With that said, open up the `convautoencoder.py` file in your project structure, and insert the following code:
```python
# import the necessary packages
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import LeakyReLU
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Reshape
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
import numpy as np

class ConvAutoencoder:
	@staticmethod
	def build(width, height, depth, filters=(32, 64), latentDim=16):
		# initialize the input shape to be "channels last" along with
		# the channels dimension itself
		inputShape = (height, width, depth)
		chanDim = -1

		# define the input to the encoder
		inputs = Input(shape=inputShape)
		x = inputs
```
Imports include `tf.keras` and NumPy.
Our `ConvAutoencoder` class contains one static method, `build`, which accepts five parameters:
- `width`: Width of the input image in pixels
- `height`: Height of the input image in pixels
- `depth`: Number of channels (i.e., depth) of the input volume
- `filters`: A tuple that contains the set of filters for convolution operations. By default, if this parameter is not provided by the caller, we’ll add two sets of `CONV => RELU => BN` with `32` and `64` filters
- `latentDim`: The number of neurons in our fully-connected (`Dense`) latent vector. By default, if this parameter is not passed, the value is set to `16`
From there, we initialize the `inputShape` and define the `Input` to the encoder (Lines 25 and 26).
Let’s begin building our encoder’s filters:
```python
		# loop over the number of filters
		for f in filters:
			# apply a CONV => RELU => BN operation
			x = Conv2D(f, (3, 3), strides=2, padding="same")(x)
			x = LeakyReLU(alpha=0.2)(x)
			x = BatchNormalization(axis=chanDim)(x)

		# flatten the network and then construct our latent vector
		volumeSize = K.int_shape(x)
		x = Flatten()(x)
		latent = Dense(latentDim)(x)

		# build the encoder model
		encoder = Model(inputs, latent, name="encoder")
```
Using Keras’ functional API, we loop over the number of `filters` and add our sets of `CONV => RELU => BN` layers (Lines 29-33).
We then flatten the network and construct our latent vector (Lines 36-38). The latent-space representation is the compressed form of our data.
From there, we build the `encoder` portion of our autoencoder (Line 41).
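To see that compression concretely, the following snippet (a sketch, assuming you are holding the `encoder` model returned by `ConvAutoencoder.build`, which we do later in the training script) pushes a single 28×28 image through the encoder:

```python
import numpy as np

# a single hypothetical 28x28 grayscale image with pixels in [0, 1]
image = np.random.rand(1, 28, 28, 1).astype("float32")

# the encoder compresses 28 x 28 = 784 pixel values down to just
# latentDim = 16 numbers
latentVector = encoder.predict(image)
print(latentVector.shape)  # (1, 16)
```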
Next, we’ll use our latent-space representation to reconstruct the original input image.
```python
		# start building the decoder model which will accept the
		# output of the encoder as its inputs
		latentInputs = Input(shape=(latentDim,))
		x = Dense(np.prod(volumeSize[1:]))(latentInputs)
		x = Reshape((volumeSize[1], volumeSize[2], volumeSize[3]))(x)

		# loop over our number of filters again, but this time in
		# reverse order
		for f in filters[::-1]:
			# apply a CONV_TRANSPOSE => RELU => BN operation
			x = Conv2DTranspose(f, (3, 3), strides=2,
				padding="same")(x)
			x = LeakyReLU(alpha=0.2)(x)
			x = BatchNormalization(axis=chanDim)(x)

		# apply a single CONV_TRANSPOSE layer used to recover the
		# original depth of the image
		x = Conv2DTranspose(depth, (3, 3), padding="same")(x)
		outputs = Activation("sigmoid")(x)

		# build the decoder model
		decoder = Model(latentInputs, outputs, name="decoder")

		# our autoencoder is the encoder + decoder
		autoencoder = Model(inputs, decoder(encoder(inputs)),
			name="autoencoder")

		# return a 3-tuple of the encoder, decoder, and autoencoder
		return (encoder, decoder, autoencoder)
```
Here, we take the latent input and use a fully-connected layer to reshape it into a 3D volume (i.e., the image data).
We loop over our filters again, but in reverse order, applying `CONV_TRANSPOSE => RELU => BN` layers, where the `CONV_TRANSPOSE` layer’s purpose is to increase the volume size.
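If the upsampling behavior is unclear, this small standalone sketch (shapes chosen only for illustration, matching what our encoder produces for 28×28 MNIST inputs after two stride-2 convolutions) shows a strided `Conv2DTranspose` doubling the spatial dimensions:

```python
from tensorflow.keras.layers import Conv2DTranspose, Input
from tensorflow.keras.models import Model

# a 7x7x64 input volume, the spatial size left over after two
# stride-2 convolutions on a 28x28 input (28 -> 14 -> 7)
inputs = Input(shape=(7, 7, 64))
x = Conv2DTranspose(32, (3, 3), strides=2, padding="same")(inputs)
model = Model(inputs, x)

# with strides=2 and "same" padding, each spatial dimension doubles
print(model.output_shape)  # (None, 14, 14, 32)
```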
Finally, we build the decoder model and construct the autoencoder. Remember, the concept of an autoencoder — discussed last week — consists of both the encoder and decoder components.
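With the full class in place, a quick sanity check like the following (a sketch, assuming the project structure above) confirms the three models wire together as described:

```python
from pyimagesearch.convautoencoder import ConvAutoencoder

# build the encoder, decoder, and autoencoder for 28x28 grayscale input
(encoder, decoder, autoencoder) = ConvAutoencoder.build(28, 28, 1)

# print the architectures: the encoder compresses the image down to a
# 16-dim latent vector, and the decoder reconstructs the image from it
encoder.summary()
decoder.summary()
autoencoder.summary()
```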
Implementing the denoising autoencoder training script
Let’s now implement the training script used to:
- Add stochastic noise to the MNIST dataset
- Train a denoising autoencoder on the noisy dataset
- Automatically recover the original digits from the noise
My implementation follows Francois Chollet’s own implementation of denoising autoencoders on the official Keras blog — my primary contribution here is to go into a bit more detail regarding the implementation itself.
Open up the `train_denoising_autoencoder.py` file, and insert the following code:
```python
# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from pyimagesearch.convautoencoder import ConvAutoencoder
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-s", "--samples", type=int, default=8,
	help="# number of samples to visualize when decoding")
ap.add_argument("-o", "--output", type=str, default="output.png",
	help="path to output visualization file")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
	help="path to output plot file")
args = vars(ap.parse_args())
```
On Lines 2-12 we handle our imports. We’ll use the `"Agg"` backend of `matplotlib` so that we can export our training plot to disk. Our custom `ConvAutoencoder` class implemented in the previous section contains the autoencoder architecture itself. Modeling after Chollet’s example, we will also use the Adam optimizer.
Our script accepts three optional command line arguments:
- `--samples`: The number of output samples for visualization. By default this value is set to `8`.
- `--output`: The path to the output visualization image. We’ll name our visualization `output.png` by default.
- `--plot`: The path to our matplotlib output plot. A default of `plot.png` is assigned if this argument is not provided in the terminal.
Next, we initialize hyperparameters and preprocess our MNIST dataset:
```python
# initialize the number of epochs to train for and batch size
EPOCHS = 25
BS = 32

# load the MNIST dataset
print("[INFO] loading MNIST dataset...")
((trainX, _), (testX, _)) = mnist.load_data()

# add a channel dimension to every image in the dataset, then scale
# the pixel intensities to the range [0, 1]
trainX = np.expand_dims(trainX, axis=-1)
testX = np.expand_dims(testX, axis=-1)
trainX = trainX.astype("float32") / 255.0
testX = testX.astype("float32") / 255.0
```
Our training epochs will be `25` and we’ll use a batch size of `32`.
We go ahead and grab the MNIST dataset (Line 30) while Lines 34-37 (1) add a channel dimension to every image in the dataset, and (2) scale the pixel intensities to the range [0, 1].
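If you’d like to verify those preprocessing steps yourself, a quick check along these lines shows the shape and range changes (the printed values are what the standard MNIST arrays produce):

```python
import numpy as np
from tensorflow.keras.datasets import mnist

((trainX, _), (_, _)) = mnist.load_data()
print(trainX.shape, trainX.dtype)  # (60000, 28, 28) uint8

# add a trailing channel dimension so Keras sees 28x28x1 volumes
trainX = np.expand_dims(trainX, axis=-1)
print(trainX.shape)  # (60000, 28, 28, 1)

# scale raw pixel intensities from [0, 255] down to [0, 1]
trainX = trainX.astype("float32") / 255.0
print(trainX.min(), trainX.max())  # 0.0 1.0
```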
At this point, we’ll deviate from last week’s tutorial:
```python
# sample noise from a random normal distribution centered at 0.5 (since
# our images lie in the range [0, 1]) and a standard deviation of 0.5
trainNoise = np.random.normal(loc=0.5, scale=0.5, size=trainX.shape)
testNoise = np.random.normal(loc=0.5, scale=0.5, size=testX.shape)
trainXNoisy = np.clip(trainX + trainNoise, 0, 1)
testXNoisy = np.clip(testX + testNoise, 0, 1)
```
To add random noise to the MNIST digits, we use NumPy’s random normal distribution centered at `0.5` with a standard deviation of `0.5` (Lines 41-44).
The following figure shows an example of how our images look before adding noise (left) and after (right):
As you can see, our images are quite corrupted — recovering the original digit from the noise will require a powerful model.
Luckily, our denoising autoencoder will be up to the task:
```python
# construct our convolutional autoencoder
print("[INFO] building autoencoder...")
(encoder, decoder, autoencoder) = ConvAutoencoder.build(28, 28, 1)
opt = Adam(lr=1e-3)
autoencoder.compile(loss="mse", optimizer=opt)

# train the convolutional autoencoder
H = autoencoder.fit(
	trainXNoisy, trainX,
	validation_data=(testXNoisy, testX),
	epochs=EPOCHS,
	batch_size=BS)

# construct a plot that plots and saves the training history (note
# that we only plot the loss, since we do not track accuracy here)
N = np.arange(0, EPOCHS)
plt.style.use("ggplot")
plt.figure()
plt.plot(N, H.history["loss"], label="train_loss")
plt.plot(N, H.history["val_loss"], label="val_loss")
plt.title("Training Loss")
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.legend(loc="lower left")
plt.savefig(args["plot"])
```
Line 48 builds our denoising autoencoder, passing the necessary arguments. Using our `Adam` optimizer with an initial learning rate of `1e-3`, we go ahead and compile the autoencoder with mean-squared error loss (Lines 49 and 50).
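For reference, the mean-squared error being minimized is the average squared pixel-wise difference between the clean target image $x$ and the reconstruction $\hat{x}$ over all $N$ pixel values:

$$\text{MSE}(x, \hat{x}) = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2$$

Driving this quantity down forces the network to reproduce the clean targets rather than the noisy inputs.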
Training is launched via Lines 53-57. Using the training history data, `H`, Lines 60-69 plot the loss, saving the resulting figure to disk.
Let’s write a quick loop that will help us visualize the denoising autoencoder results:
```python
# use the convolutional autoencoder to make predictions on the
# testing images, then initialize our list of output images
print("[INFO] making predictions...")
decoded = autoencoder.predict(testXNoisy)
outputs = None

# loop over our number of output samples
for i in range(0, args["samples"]):
	# grab the original image and reconstructed image
	original = (testXNoisy[i] * 255).astype("uint8")
	recon = (decoded[i] * 255).astype("uint8")

	# stack the original and reconstructed image side-by-side
	output = np.hstack([original, recon])

	# if the outputs array is empty, initialize it as the current
	# side-by-side image display
	if outputs is None:
		outputs = output

	# otherwise, vertically stack the outputs
	else:
		outputs = np.vstack([outputs, output])

# save the outputs image to disk
cv2.imwrite(args["output"], outputs)
```
We go ahead and use our trained autoencoder to remove the noise from the images in our testing set (Line 74).
We then grab `--samples` worth of original and reconstructed images and put together a visualization montage (Lines 78-93). Line 96 writes the visualization figure to disk for inspection.
Training the denoising autoencoder with Keras and TensorFlow
To train your denoising autoencoder, make sure you use the “Downloads” section of this tutorial to download the source code.
From there, open up a terminal and execute the following command:
```
$ python train_denoising_autoencoder.py --output output_denoising.png \
	--plot plot_denoising.png
[INFO] loading MNIST dataset...
[INFO] building autoencoder...
Train on 60000 samples, validate on 10000 samples
Epoch 1/25
60000/60000 [==============================] - 85s 1ms/sample - loss: 0.0285 - val_loss: 0.0191
Epoch 2/25
60000/60000 [==============================] - 83s 1ms/sample - loss: 0.0187 - val_loss: 0.0211
Epoch 3/25
60000/60000 [==============================] - 84s 1ms/sample - loss: 0.0177 - val_loss: 0.0174
Epoch 4/25
60000/60000 [==============================] - 84s 1ms/sample - loss: 0.0171 - val_loss: 0.0170
Epoch 5/25
60000/60000 [==============================] - 83s 1ms/sample - loss: 0.0167 - val_loss: 0.0177
...
Epoch 21/25
60000/60000 [==============================] - 67s 1ms/sample - loss: 0.0146 - val_loss: 0.0161
Epoch 22/25
60000/60000 [==============================] - 67s 1ms/sample - loss: 0.0145 - val_loss: 0.0164
Epoch 23/25
60000/60000 [==============================] - 67s 1ms/sample - loss: 0.0145 - val_loss: 0.0158
Epoch 24/25
60000/60000 [==============================] - 67s 1ms/sample - loss: 0.0144 - val_loss: 0.0155
Epoch 25/25
60000/60000 [==============================] - 66s 1ms/sample - loss: 0.0144 - val_loss: 0.0157
[INFO] making predictions...
```
Training the denoising autoencoder on my iMac Pro with a 3 GHz Intel Xeon W processor took ~32.20 minutes.
As Figure 3 shows, our training process was stable and shows no signs of overfitting.
Denoising autoencoder results
Our denoising autoencoder has been successfully trained, but how did it perform when removing the noise we added to the MNIST dataset?
To answer that question, take a look at Figure 4:
On the left we have the original MNIST digits that we added noise to while on the right we have the output of the denoising autoencoder — we can clearly see that the denoising autoencoder was able to recover the original signal (i.e., digit) from the image while removing the noise.
More advanced denoising autoencoders can be used to automatically pre-process images to facilitate better OCR accuracy.
Summary
In this tutorial, you learned about denoising autoencoders, which, as the name suggests, are models that are used to remove noise from a signal.
In the context of computer vision, denoising autoencoders can be seen as very powerful filters that can be used for automatic pre-processing. For example, a denoising autoencoder could be used to automatically pre-process an image, improving its quality for an OCR algorithm and thereby increasing OCR accuracy.
To demonstrate a denoising autoencoder in action, we added noise to the MNIST dataset, greatly degrading the image quality to the point where any model would struggle to correctly classify the digit in the image. Using our denoising autoencoder, we were able to remove the noise from the image, recovering the original signal (i.e., the digit).
In next week’s tutorial, you’ll learn about another real-world application of autoencoders — anomaly and outlier detection.
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), just enter your email address in the form below!