In this tutorial, you will learn how to break deep learning models using image-based adversarial attacks. We will implement our adversarial attacks using the Keras and TensorFlow deep learning libraries.
Imagine it's twenty years from now. Nearly all cars and trucks on the road have been replaced with autonomous vehicles, powered by Artificial Intelligence, deep learning, and computer vision: every turn, lane switch, acceleration, and brake is powered by a deep neural network.
Now, imagine you're on the highway. You're sitting in the "driver's seat" (is it really a "driver's seat" if the car is doing the driving?) while your spouse is in the passenger seat, and your kids are in the back.
Looking ahead, you see a large sticker plastered on the lane your car is driving in. It looks innocent enough. It's just a big print of the graffiti artist Banksy's popular Girl with Balloon work. Some high school kids probably just put it there as part of a weird dare/practical joke.
A split second later, your car reacts by braking violently and then switching lanes, as if the large art print plastered on the road were a human, an animal, or another vehicle. You're jerked so hard that you feel the whiplash. Your spouse screams while Cheerios from your kid in the backseat rocket forward, hitting the windshield and bouncing all over the center console.
You and your family are safe ... but it could have been a lot worse.
What happened? Why did your self-driving car react that way? Was it some sort of weird "bug" in the code/software your car is running?
The answer is that the deep neural network powering the "sight" component of your vehicle just saw an adversarial image.
Adversarial images are:
- Images that have pixels intentionally perturbed to confuse and deceive models ...
- ... but at the same time, look harmless and innocent to humans.
These images cause deep neural networks to make incorrect predictions. Adversarial images are perturbed in such a way that the model is unable to classify them correctly.
In fact, it may be impossible for humans to visually distinguish a normal image from one that has been perturbed for an adversarial attack; essentially, the two images will appear identical to the human eye.
While not an exact (or correct) comparison, I like to explain adversarial attacks in the context of image steganography. Using steganography algorithms, we can embed data (such as plaintext messages) in an image without distorting the appearance of the image itself. This image can be innocently transmitted to the receiver, who can then extract the hidden message from the image.
Similarly, adversarial attacks embed a message in an input image, but instead of a plaintext message meant for human consumption, the attack embeds a noise vector in the input image. This noise vector is purposely constructed to fool and confuse deep learning models.
But how do adversarial attacks work? And how can we defend against them?
This tutorial, along with the rest of the posts in this series, will cover those exact questions.
To learn how to break deep learning models with adversarial attacks and images using Keras/TensorFlow, just keep reading.
Adversarial images and attacks with Keras and TensorFlow
In the first part of this tutorial, we'll discuss what adversarial attacks are and how they impact deep learning models.
From there, we'll implement three separate Python scripts:
- The first one will be a helper utility used to load and parse class labels from the ImageNet dataset.
- Our next Python script will perform basic image classification using ResNet, pre-trained on the ImageNet dataset (thereby demonstrating "standard" image classification).
- The final Python script will perform an adversarial attack and construct an adversarial image that purposely confuses our ResNet model, even though the two images look identical to the human eye.
Let's get started!
What are adversarial images and adversarial attacks? And how do they impact deep learning models?
In 2014, Goodfellow et al. published a paper entitled Explaining and Harnessing Adversarial Examples, which showed an intriguing property of deep neural networks: it's possible to purposely perturb an input image such that the neural network misclassifies it. This type of perturbation is called an adversarial attack.
The classic example of an adversarial attack can be seen in Figure 2 above. On the left, we have our input image which our neural network correctly classifies as "panda" with 57.7% confidence.
In the middle, we have a noise vector, which, to the human eye, appears to be random. However, it's far from random.
Instead, the pixels in the noise vector are "equal to the sign of the elements of the gradient of the cost function with respect to the input image" (Goodfellow et al.).
We then add this noise vector to the input image, which produces the output (right) in Figure 2. To us, this image appears identical to the input; however, our neural network now classifies the image as a "gibbon" (a small ape, similar to a monkey) with 99.3% confidence.
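To make that quoted description a bit more concrete, here is a minimal sketch of how such a noise vector could be computed with TensorFlow. The function name and variables below are purely illustrative (they are not part of this tutorial's code), and the attack we implement later in this post uses an iterative variant of the same idea:

```python
# minimal FGSM-style sketch: the noise vector is the sign of the
# gradient of the loss w.r.t. the input image, scaled by a small eps
import tensorflow as tf
from tensorflow.keras.losses import SparseCategoricalCrossentropy

def fgsm_noise(model, image, labelIdx, eps=2 / 255.0):
	# `image` is assumed to be a preprocessed tensor with a batch dimension
	image = tf.convert_to_tensor(image)

	with tf.GradientTape() as tape:
		tape.watch(image)
		predictions = model(image, training=False)
		loss = SparseCategoricalCrossentropy()(
			tf.convert_to_tensor([labelIdx]), predictions)

	# take the sign of the gradient and scale it by eps
	gradient = tape.gradient(loss, image)
	return eps * tf.sign(gradient)
```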
Creepy, right?
A brief history of adversarial attacks and images
Adversarial machine learning is not a new field, nor are these attacks specific to deep neural networks. In 2006, Barreno et al. published a paper entitled Can Machine Learning Be Secure? This paper discussed adversarial attacks, including proposed defenses against them.
Back in 2006, the top state-of-the-art machine learning models included Support Vector Machines (SVMs) and Random Forests (RFs); it's been shown that both these types of models are susceptible to adversarial attacks.
With the rise in popularity of deep neural networks starting in 2012, it was hoped that these highly non-linear models would be less susceptible to attacks; however, Goodfellow et al. (among others) dashed these hopes.
It turns out that deep neural networks are susceptible to adversarial attacks, just like their predecessors.
For more information on the history of adversarial attacks, I recommend reading Biggio and Roli's excellent 2017 paper, Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning.
Why are adversarial attacks and images a problem?
The example at the top of this tutorial outlined why adversarial attacks could cause massive damage to health, life, and property.
An example with less severe consequences would be a group of hackers identifying that Google uses a specific model for spam filtering in Gmail, or that Facebook uses a given model to automatically detect pornography in its NSFW filter.
If these hackers wanted to flood Gmail users with emails that bypass Gmail's spam filters, or upload massive amounts of pornography to Facebook that bypasses its NSFW filters, they could theoretically do so.
These are all examples of adversarial attacks with less severe consequences.
An adversarial attack in a scenario with higher consequences could include hacker-terrorists identifying that a specific deep neural network is being used for nearly all self-driving cars in the world (imagine if Tesla had a monopoly on the market and was the only self-driving car producer).
Adversarial images could then be strategically placed along roads and highways, causing massive pileups, property damage, and even injury/death to passengers in the vehicles.
Adversarial attacks are limited only by your imagination, your knowledge of a given model, and how much access you have to the model itself.
Can we defend against adversarial attacks?
The good news is that we can help reduce the impact of adversarial attacks (but not necessarily eliminate them completely).
That topic won't be covered in today's tutorial, but will be covered in a future tutorial on PyImageSearch.
Configuring your development environment
To configure your system for this tutorial, I recommend following either of these tutorials:
Either tutorial will help you configure your system with all the necessary software for this blog post in a convenient Python virtual environment.
That said, are you:
- Short on time?
- Learning on your employer's administratively locked laptop?
- Wanting to skip the hassle of fighting with package managers, bash/ZSH profiles, and virtual environments?
- Ready to run the code right now (and experiment with it to your heart's content)?
Then join PyImageSearch Plus today! Gain access to PyImageSearch tutorial Jupyter Notebooks that run on Google's Colab ecosystem in your browser, with no installation required!
Project structure
Start by using the "Downloads" section of this tutorial to download the source code and example images. From there, let's inspect our project directory structure.
$ tree --dirsfirst
.
├── pyimagesearch
│   ├── __init__.py
│   ├── imagenet_class_index.json
│   └── utils.py
├── adversarial.png
├── generate_basic_adversary.py
├── pig.jpg
└── predict_normal.py
1 directory, 7 files
Inside the pyimagesearch module, we have two files:
- imagenet_class_index.json: A JSON file, which maps ImageNet class labels to human-readable strings. We'll be using this JSON file to determine the integer index for a particular class label; this integer index will aid us when we construct our adversarial image attack.
- utils.py: Contains a simple Python helper function used to load and parse the imagenet_class_index.json.
We then have two Python scripts that we'll be reviewing today:
- predict_normal.py: Accepts an input image (pig.jpg), loads our ResNet50 model, and classifies it. The output of this script will be the ImageNet class label index of the predicted class label.
- generate_basic_adversary.py: Using the output of our predict_normal.py script, we'll construct an adversarial attack that is able to fool ResNet. The output of this script (adversarial.png) will be saved to disk.
Ready to implement your first adversarial attack with Keras and TensorFlow?
Let's dive in.
Our ImageNet class label/index helper utility
Before we can perform either normal image classification or classification with an image perturbed via an adversarial attack, we first need to create a Python helper function used to load and parse the class labels of the ImageNet dataset.
We have provided a JSON file that contains the ImageNet class label indexes, identifiers, and human-readable strings inside the imagenet_class_index.json
file in the pyimagesearch
module of our project directory structure.
I've included the first few lines of this JSON file below:
{ "0": [ "n01440764", "tench" ], "1": [ "n01443537", "goldfish" ], "2": [ "n01484850", "great_white_shark" ], "3": [ "n01491361", "tiger_shark" ], ... "106": [ "n01883070", "wombat" ], ...
Here you can see that the file is a dictionary. The key to the dictionary is the integer class label index, while the value is a 2-tuple consisting of:
- The ImageNet unique identifier for the label
- The human-readable class label
Our goal is to implement a Python function that will parse the JSON file by:
- Accepting an input class label
- Returning the integer class label index of the corresponding label
Essentially, we are inverting the key/value relationship in the imagenet_class_index.json
file.
Let's start implementing our helper function now.
Open up the utils.py
file in the pyimagesearch
module, and insert the following code:
# import necessary packages
import json
import os

def get_class_idx(label):
	# build the path to the ImageNet class label mappings file
	labelPath = os.path.join(os.path.dirname(__file__),
		"imagenet_class_index.json")
Lines 2 and 3 import our required Python packages. We'll be using the json
Python module to load our JSON file, while the os
package will be used to construct file paths, agnostic of which operating system you are using.
We then define our get_class_idx
helper function. The goal of this function is to accept an input class label and then obtain the integer index of the prediction (i.e., which index out of the 1,000 class labels that a model trained on ImageNet would be able to predict).
Line 7 constructs the path to the imagenet_class_index.json
, which lives inside the pyimagesearch
module.
Let's load the contents of that JSON file now:
	# open the ImageNet class mappings file and load the mappings as
	# a dictionary with the human-readable class label as the key and
	# the integer index as the value
	with open(labelPath) as f:
		imageNetClasses = {labels[1]: int(idx) for
			(idx, labels) in json.load(f).items()}

	# check to see if the input class label has a corresponding
	# integer index value, and if so return it; otherwise return
	# a None-type value
	return imageNetClasses.get(label, None)
Lines 13-15 open the labelPath
file and proceed to invert the key/value relationship such that the key is the human-readable label string and the value is the integer index that corresponds to that label.
In order to obtain the integer index for the input label
, we make a call to the .get
method of the imageNetClasses
dictionary (Line 20). This call will return either:
- The integer index of the label (if it exists in the dictionary)
- And if the label does not exist in imageNetClasses, it will return None
This value is then returned to the calling function.
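As a quick sanity check, here is how you might exercise the helper from the project's root directory; the expected outputs follow from the JSON snippet shown earlier:

```python
from pyimagesearch.utils import get_class_idx

# look up a few labels from the ImageNet class index file
print(get_class_idx("goldfish"))     # 1
print(get_class_idx("wombat"))       # 106
print(get_class_idx("not_a_label"))  # None (no such ImageNet label)
```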
Let's put our get_class_idx
helper function to work in the following section.
Normal image classification without adversarial attacks using Keras and TensorFlow
With our ImageNet class label/index helper function implemented, let's first create an image classification script that performs basic classification with no adversarial attacks.
This script will demonstrate that our ResNet model is performing as we would expect it to (i.e., making correct predictions). Later in this tutorial, you'll discover how to construct an adversarial image such that it confuses ResNet.
Let's get started with our basic image classification script. Open up the predict_normal.py
file in your project directory structure, and insert the following code:
# import necessary packages
from pyimagesearch.utils import get_class_idx
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import decode_predictions
from tensorflow.keras.applications.resnet50 import preprocess_input
import numpy as np
import argparse
import imutils
import cv2
We import our required Python packages on Lines 2-9. These will all look fairly standard to you if you've ever worked with Keras, TensorFlow, and OpenCV before.
That said, if you are new to Keras and TensorFlow, I strongly encourage you to read my Keras Tutorial: How to get started with Keras, Deep Learning, and Python guide. Additionally, you may want to read my book Deep Learning for Computer Vision with Python to obtain a deeper understanding of how to train your own custom neural networks.
With all that said, take notice of Line 2, where we import our get_class_idx
function, which we defined in the previous section; this function will allow us to obtain the integer index of the top predicted label from our ResNet50 model.
Let's move on to defining our preprocess_image
helper function:
def preprocess_image(image):
	# swap color channels, preprocess the image, and add in a batch
	# dimension
	image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
	image = preprocess_input(image)
	image = cv2.resize(image, (224, 224))
	image = np.expand_dims(image, axis=0)

	# return the preprocessed image
	return image
The preprocess_image
method accepts a single required argument, the image
that we wish to preprocess.
We preprocess the image by:
- Swapping the image from BGR to RGB channel ordering
- Calling the preprocess_input function, which performs ResNet50-specific preprocessing and scaling
- Resizing the image to 224×224
- Adding in a batch dimension
The preprocessed image
is then returned to the calling function.
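If you want to confirm the helper behaves as expected, a quick check (assuming you are in the project directory with pig.jpg present) might look like this:

```python
# whatever the size of the input image, the preprocessed result should
# be a single-image batch of shape (1, 224, 224, 3)
image = cv2.imread("pig.jpg")
blob = preprocess_image(image)
print(blob.shape)  # (1, 224, 224, 3)
```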
Next, let's parse our command line arguments:
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
args = vars(ap.parse_args())
We only need a single command line argument here, --image
, which is the path to our input image residing on disk.
If you've never worked with command line arguments and argparse
before, I suggest you read the following tutorial.
Let's now load our input image from disk and preprocess it:
# load image from disk and make a clone for annotation
print("[INFO] loading image...")
image = cv2.imread(args["image"])
output = image.copy()

# preprocess the input image
output = imutils.resize(output, width=400)
preprocessedImage = preprocess_image(image)
A call to cv2.imread
loads our input image from disk. We clone it on Line 31 so we can later draw on it/annotate it with the final output class label prediction.
We resize the output
image to have a width of 400 pixels, such that it fits on our screen. We also call our preprocess_image
function on the input image
to prepare it for classification by ResNet.
With our image preprocessed, we can load ResNet and classify the image:
# load the pre-trained ResNet50 model
print("[INFO] loading pre-trained ResNet50 model...")
model = ResNet50(weights="imagenet")

# make predictions on the input image and parse the top-3 predictions
print("[INFO] making predictions...")
predictions = model.predict(preprocessedImage)
predictions = decode_predictions(predictions, top=3)[0]
On Line 39 we load ResNet from disk with weights pre-trained on the ImageNet dataset.
Lines 43 and 44 make predictions on our preprocessed image, which we then decode using the decode_predictions
helper function in Keras/TensorFlow.
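If you haven't used decode_predictions before, keep in mind that after the call above, predictions is simply a list of 3-tuples; the snippet below (illustrative, not part of the script) shows how its entries are structured:

```python
# each entry of `predictions` is an (imagenetID, label, probability)
# tuple, sorted from most to least likely -- so predictions[0][1] is
# the top-1 label and predictions[0][2] is its probability
for (imagenetID, label, prob) in predictions:
	print(imagenetID, label, prob)
```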
Let's now loop over the top-3 predictions from the network and display the class labels:
# loop over the top three predictions
for (i, (imagenetID, label, prob)) in enumerate(predictions):
	# print the ImageNet class label ID of the top prediction to our
	# terminal (we'll need this label for our next script which will
	# perform the actual adversarial attack)
	if i == 0:
		print("[INFO] {} => {}".format(label, get_class_idx(label)))

	# display the prediction to our screen
	print("[INFO] {}. {}: {:.2f}%".format(i + 1, label, prob * 100))
Line 47 begins a loop over the top-3 predictions.
If this is the first prediction (i.e., the top-1 prediction), we display the human-readable label to our terminal and then look up the ImageNet integer index of the corresponding label using our get_class_idx
function.
We also display the top-3 labels and corresponding probability to our terminal.
The final step is to draw the top-1 prediction on the output
image:
# draw the top-most predicted label on the image along with the
# confidence score
text = "{}: {:.2f}%".format(predictions[0][1],
	predictions[0][2] * 100)
cv2.putText(output, text, (3, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.8,
	(0, 255, 0), 2)

# show the output image
cv2.imshow("Output", output)
cv2.waitKey(0)
The output
image is displayed on our screen until the window opened by OpenCV is clicked on and a key is pressed.
Non-adversarial image classification results
We are now ready to perform basic image classification (i.e., no adversarial attack) with ResNet.
Start by using the "Downloads" section of this tutorial to download the source code and example images.
From there, open up a terminal and execute the following command:
$ python predict_normal.py --image pig.jpg
[INFO] loading image...
[INFO] loading pre-trained ResNet50 model...
[INFO] making predictions...
[INFO] hog => 341
[INFO] 1. hog: 99.97%
[INFO] 2. wild_boar: 0.03%
[INFO] 3. piggy_bank: 0.00%
Here you can see that we have classified an input image of a pig, with 99.97% confidence.
Additionally, take note of the "hog" ImageNet label ID (341); we'll be using this class label ID in the next section, where we will perform an adversarial attack on the hog input image.
Implementing adversarial images and attacks with Keras and TensorFlow
We will now learn how to implement adversarial attacks with Keras and TensorFlow.
Open up the generate_basic_adversary.py
file in our project directory structure, and insert the following code:
# import necessary packages
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.applications.resnet50 import decode_predictions
from tensorflow.keras.applications.resnet50 import preprocess_input
import tensorflow as tf
import numpy as np
import argparse
import cv2
We start by importing our required Python packages on Lines 2-10. You'll notice that we are once again using the ResNet50
architecture with its corresponding preprocess_input
function (for preprocessing/scaling input images) and decode_predictions
utility to decode output predictions and display the human-readable ImageNet labels.
The SparseCategoricalCrossentropy
computes the categorical cross-entropy loss between the labels and predictions. By using the sparse implementation of categorical cross-entropy, we do not have to explicitly one-hot encode our class labels like we would if we were using scikit-learn's LabelBinarizer or Keras/TensorFlow's to_categorical
utility.
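Here is a small, self-contained illustration of that difference, using toy values that are not part of our attack script:

```python
import tensorflow as tf
from tensorflow.keras.losses import (CategoricalCrossentropy,
	SparseCategoricalCrossentropy)

# a fake softmax output for a single sample over 3 classes
preds = tf.constant([[0.1, 0.2, 0.7]])

# sparse version: pass the integer class index directly
sparseLoss = SparseCategoricalCrossentropy()(tf.constant([2]), preds)

# standard version: the same label must be one-hot encoded first
denseLoss = CategoricalCrossentropy()(tf.constant([[0.0, 0.0, 1.0]]), preds)

# both print ~0.3567 (i.e., -log(0.7))
print(sparseLoss.numpy(), denseLoss.numpy())
```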
Just like we had a preprocess_image
utility in our predict_normal.py
script, we need one for this script as well:
def preprocess_image(image):
	# swap color channels, resize the input image, and add a batch
	# dimension
	image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
	image = cv2.resize(image, (224, 224))
	image = np.expand_dims(image, axis=0)

	# return the preprocessed image
	return image
This implementation is identical to the one above with the exception of leaving out the preprocess_input
function call; you'll see why we are leaving out that call once we start constructing our adversarial image.
Next up, we have a simple helper utility, clip_eps
:
def clip_eps(tensor, eps):
	# clip the values of the tensor to a given range and return it
	return tf.clip_by_value(tensor, clip_value_min=-eps,
		clip_value_max=eps)
The goal of this function is to accept an input tensor
and then clip any values inside the input to the range [-eps, eps]
.
The clipped tensor is then returned to the calling function.
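As a quick, hypothetical example of what clip_eps does (using the same EPS value of 2/255 that we define later in the script):

```python
# values outside [-eps, eps] are squashed to the boundary, while values
# already inside the range pass through unchanged
example = tf.constant([-0.5, 0.001, 0.5])
print(clip_eps(example, eps=2 / 255.0))
# -> approximately [-0.00784, 0.001, 0.00784]
```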
We now arrive at the generate_adversaries
function, which is the meat of our adversarial attack:
def generate_adversaries(model, baseImage, delta, classIdx, steps=50):
	# iterate over the number of steps
	for step in range(0, steps):
		# record our gradients
		with tf.GradientTape() as tape:
			# explicitly indicate that our perturbation vector should
			# be tracked for gradient updates
			tape.watch(delta)
The generate_adversaries method is the workhorse of our script. This function accepts four required parameters and an optional fifth one:
- model: Our ResNet50 model (you could swap in a different pre-trained model such as VGG16, MobileNet, etc. if you prefer).
- baseImage: The original non-perturbed input image that we wish to construct an adversarial attack for, causing our model to misclassify it.
- delta: Our noise vector, which will be added to the baseImage, ultimately causing the misclassification. We'll update this delta vector by means of gradient descent.
- classIdx: The integer class label index we obtained by running the predict_normal.py script.
- steps: Number of gradient descent steps to perform (defaults to 50 steps).
Line 29 starts a loop over our number of steps
.
We then use GradientTape
to record our gradients. Calling the .watch
method of the tape
explicitly indicates that our perturbation vector should be tracked for updates.
We can now construct our adversarial image:
			# add our perturbation vector to the base image and
			# preprocess the resulting image
			adversary = preprocess_input(baseImage + delta)

			# run this newly constructed image tensor through our
			# model and calculate the loss with respect to the
			# *original* class index
			predictions = model(adversary, training=False)
			loss = -sccLoss(tf.convert_to_tensor([classIdx]),
				predictions)

			# check to see if we are logging the loss value, and if
			# so, display it to our terminal
			if step % 5 == 0:
				print("step: {}, loss: {}...".format(step,
					loss.numpy()))

		# calculate the gradients of loss with respect to the
		# perturbation vector
		gradients = tape.gradient(loss, delta)

		# update the weights, clip the perturbation vector, and
		# update its value
		optimizer.apply_gradients([(gradients, delta)])
		delta.assign_add(clip_eps(delta, eps=EPS))

	# return the perturbation vector
	return delta
Line 38 constructs our adversary
image by adding the delta
perturbation vector to the baseImage
. The result of this addition is passed through ResNet50's preprocess_input
function to scale and normalize the resulting adversarial image.
From there, the following takes place:
- Line 43 takes our model and makes predictions on the newly constructed adversary.
- Lines 44 and 45 calculate the loss with respect to the original classIdx (i.e., the integer index of the top-1 ImageNet class label, which we obtained by running predict_normal.py). Note that the loss is negated: running gradient descent on the negative cross-entropy is equivalent to maximizing the loss for the original class, pushing the prediction away from the correct label.
- Lines 49-51 show our resulting loss every five steps.
Outside of the with
statement now, we calculate the gradients of the loss with respect to our perturbation vector (Line 55).
We can then update the delta
vector and clip any values that fall outside the [-EPS, EPS]
range.
Finally, we return the resulting perturbation vector to the calling function; the final delta
value will allow us to construct the adversarial attack used to fool our model.
With the workhorse of our adversarial script implemented, let's move on to parsing our command line arguments:
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True,
	help="path to original input image")
ap.add_argument("-o", "--output", required=True,
	help="path to output adversarial image")
ap.add_argument("-c", "--class-idx", type=int, required=True,
	help="ImageNet class ID of the predicted label")
args = vars(ap.parse_args())
Our adversarial attack Python script requires three command line arguments:
- --input: The path to the input image (i.e., pig.jpg) residing on disk.
- --output: The output adversarial image after constructing the attack (adversarial.png).
- --class-idx: The integer class label index from the ImageNet dataset. We obtained this value by running predict_normal.py in the "Non-adversarial image classification results" section of this tutorial.
We can now perform a couple of initializations and load/preprocess our --input
image:
# define the epsilon and learning rate constants
EPS = 2 / 255.0
LR = 0.1

# load the input image from disk and preprocess it
print("[INFO] loading image...")
image = cv2.imread(args["input"])
image = preprocess_image(image)
Line 76 defines our epsilon (EPS
) value used for clipping tensors when constructing the adversarial image. An EPS
value of 2 / 255.0
is a standard value used in adversarial publications and tutorials (the following guide is also helpful if you're interested in learning more about this "default" value).
We then define our learning rate on Line 77. A value of LR = 0.1
was obtained by empirical tuning; you may need to update this value when constructing your own adversarial images.
Lines 81 and 82 load our input image from disk and preprocess it using our preprocess_image
helper function.
Next, we can load our ResNet model:
# load the pre-trained ResNet50 model for running inference
print("[INFO] loading pre-trained ResNet50 model...")
model = ResNet50(weights="imagenet")

# initialize optimizer and loss function
optimizer = Adam(learning_rate=LR)
sccLoss = SparseCategoricalCrossentropy()
Line 86 loads the ResNet50 model, pre-trained on the ImageNet dataset.
We'll use the Adam optimizer, along with the sparse categorical cross-entropy loss, when updating our perturbation vector.
Let's now construct our adversarial image:
# create a tensor based off the input image and initialize the
# perturbation vector (we will update this vector via training)
baseImage = tf.constant(image, dtype=tf.float32)
delta = tf.Variable(tf.zeros_like(baseImage), trainable=True)

# generate the perturbation vector to create an adversarial example
print("[INFO] generating perturbation...")
deltaUpdated = generate_adversaries(model, baseImage, delta,
	args["class_idx"])

# create the adversarial example, swap color channels, and save the
# output image to disk
print("[INFO] creating adversarial example...")
adverImage = (baseImage + deltaUpdated).numpy().squeeze()
adverImage = np.clip(adverImage, 0, 255).astype("uint8")
adverImage = cv2.cvtColor(adverImage, cv2.COLOR_RGB2BGR)
cv2.imwrite(args["output"], adverImage)
Line 94 constructs a tensor from our input image, while Line 95 initializes delta
, our perturbation vector.
To actually construct and update the delta
vector, we make a call to generate_adversaries
, passing in our ResNet50 model, input image, perturbation vector, and integer class label index.
The generate_adversaries
function runs, updating the delta
perturbation vector along the way, resulting in deltaUpdated
, the final noise vector.
We construct our final adversarial image (adverImage
) on Line 105 by adding the deltaUpdated
vector to baseImage
.
Afterward, we proceed to post-process the resulting adversarial image by:
- Clipping any values that fall outside the range [0, 255]
- Converting the image to an unsigned 8-bit integer (so that OpenCV can now operate on the image)
- Swapping color channel ordering from RGB to BGR
After the above preprocessing steps, we write the output adversarial image to disk.
The real question is, can our newly constructed adversarial image fool our ResNet model?
The next code block will address that question:
# run inference with this adversarial example, parse the results,
# and display the top-1 predicted result
print("[INFO] running inference on the adversarial example...")
preprocessedImage = preprocess_input(baseImage + deltaUpdated)
predictions = model.predict(preprocessedImage)
predictions = decode_predictions(predictions, top=3)[0]
label = predictions[0][1]
confidence = predictions[0][2] * 100
print("[INFO] label: {} confidence: {:.2f}%".format(label,
	confidence))

# draw the top-most predicted label on the adversarial image along
# with the confidence score
text = "{}: {:.2f}%".format(label, confidence)
cv2.putText(adverImage, text, (3, 20),
	cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

# show the output image
cv2.imshow("Output", adverImage)
cv2.waitKey(0)
We once again construct our adversarial image on Line 113 by adding the delta noise vector to our original input image, but this time we call ResNet's preprocess_input
utility on it.
The resulting preprocessed image is passed through ResNet, after which we grab the top-3 predictions and decode them (Lines 114 and 115).
We then grab the label and corresponding probability/confidence of the top-1 prediction and display these values to our terminal (Lines 116-119).
The final step is to draw the top prediction on our output adversarial image and display it to our screen.
Results of adversarial images and attacks
Ready to see an adversarial attack in action?
Make sure you used the "Downloads" section of this tutorial to download the source code and example images.
From there, you can open up a terminal and execute the following command:
$ python generate_basic_adversary.py --input pig.jpg --output adversarial.png --class-idx 341
[INFO] loading image...
[INFO] loading pre-trained ResNet50 model...
[INFO] generating perturbation...
step: 0, loss: -0.0004124982515349984...
step: 5, loss: -0.0010656398953869939...
step: 10, loss: -0.005332294851541519...
step: 15, loss: -0.06327803432941437...
step: 20, loss: -0.7707189321517944...
step: 25, loss: -3.4659299850463867...
step: 30, loss: -7.515471935272217...
step: 35, loss: -13.503922462463379...
step: 40, loss: -16.118188858032227...
step: 45, loss: -16.118192672729492...
[INFO] creating adversarial example...
[INFO] running inference on the adversarial example...
[INFO] label: wombat confidence: 100.00%
Our input pig.jpg
, which was correctly classified as "hog" in the previous section, is now labeled as a "wombat"!
I've placed the original pig.jpg
image next to the adversarial image generated by our generate_basic_adversary.py
script below:
On the left is the original hog image, while on the right we have the output adversarial image, which is incorrectly classified as a "wombat".
As you can see, there is no perceptible difference between the two images; our human eyes cannot see any difference between them, but to ResNet, they are totally different.
That's all well and good, but we clearly don't have control over the final class label in the adversarial image. That raises the question:
Is it possible to control what the final output class label of the input image is? The answer is yes; I'll be covering that question in next week's tutorial.
I'll conclude by saying that it's easy to get scared of adversarial images and adversarial attacks if you let your imagination get the best of you. But as we'll see in a later tutorial on PyImageSearch, we can actually defend against these types of attacks. More on that later.
Credits
This tutorial would not have been possible without the research of Goodfellow, Szegedy, and many other deep learning researchers.
Additionally, I want to call out that the implementation used in today's tutorial is inspired by TensorFlow's official implementation of the Fast Gradient Sign Method. I strongly suggest you take a look at their example, which does a fantastic job explaining the more theoretical and mathematically motivated aspects of this tutorial.
What's next? I recommend PyImageSearch University.
30+ total classes • 39h 44m video • Last updated: 12/2021
★★★★★ 4.84 (128 Ratings) • 3,000+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That's not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that's exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you'll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you'll find:
- ✓ 30+ courses on essential computer vision, deep learning, and OpenCV topics
- ✓ 30+ Certificates of Completion
- ✓ 39h 44m on-demand video
- ✓ Brand new courses released every month, ensuring you can keep up with state-of-the-art techniques
- ✓ Pre-configured Jupyter Notebooks in Google Colab
- ✓ Run all code examples in your web browser; works on Windows, macOS, and Linux (no dev environment configuration required!)
- ✓ Access to centralized code repos for all 500+ tutorials on PyImageSearch
- ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
- ✓ Access on mobile, laptop, desktop, etc.
Summary
In this tutorial, you learned about adversarial attacks, how they work, and the threat they pose to a world becoming more and more reliant on Artificial Intelligence and deep neural networks.
We then implemented a basic adversarial attack algorithm using the Keras and TensorFlow deep learning libraries.
Using adversarial attacks, we can purposely perturb an input image such that:
- The input image is misclassified
- However, to the human eye, the perturbed image looks identical to the original
That said, using the method applied here today, we have absolutely no control over what the final class label of the image is; all we're doing is creating and embedding a noise vector that causes the deep neural network to misclassify the image.
But what if we could control what the final target class label is? For example, is it possible to take an image of a "dog" and construct an adversarial attack such that the Convolutional Neural Network thinks the image is a "cat"?
The answer is yes, and we'll be covering that exact topic in next week's tutorial.
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), simply enter your email address in the form below!
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
Comment section
Hey, Adrian Rosebrock here, author and creator of PyImageSearch. While I love hearing from readers, a couple years ago I made the tough decision to no longer offer 1:1 help over blog post comments.
At the time I was receiving 200+ emails per day and another 100+ blog post comments. I simply did not have the time to moderate and respond to them all, and the sheer volume of requests was taking a toll on me.
Instead, my goal is to do the most good for the computer vision, deep learning, and OpenCV community at large by focusing my time on authoring high-quality blog posts, tutorials, and books/courses.
If you need help learning computer vision and deep learning, I suggest you refer to my full catalog of books and courses; they have helped tens of thousands of developers, students, and researchers just like yourself learn Computer Vision, Deep Learning, and OpenCV.
Click here to browse my full catalog.