In this tutorial, you will learn how to use OpenCV and the cv2.threshold function to apply basic thresholding and Otsu thresholding.
Thresholding is one of the most common (and basic) segmentation techniques in computer vision and it allows us to separate the foreground (i.e., the objects that we are interested in) from the background of the image.
Thresholding comes in three forms:
- We have simple thresholding where we manually supply parameters to segment the image — this works extremely well in controlled lighting conditions where we can ensure high contrast between the foreground and background of the image.
- We also have methods such as Otsu’s thresholding that attempt to be more dynamic and automatically compute the optimal threshold value based on the input image.
- And finally we have adaptive thresholding which, instead of trying to threshold an image globally using a single value, instead breaks the image down into smaller pieces, and thresholds each of these pieces separately and individually.
We’ll be discussing simple thresholding and Otsu’s thresholding here today. Our next tutorial will cover adaptive thresholding in detail.
To learn how to apply basic thresholding and Otsu thresholding with OpenCV and the cv2.threshold function, just keep reading.
OpenCV Thresholding (cv2.threshold)
In the first part of this tutorial, we’ll discuss the concept of thresholding and how thresholding can help us segment images using OpenCV.
From there we’ll configure our development environment and review our project directory structure.
I’ll then show you two methods to threshold an image using OpenCV:
- Basic thresholding where you have to manually supply a threshold value, T
- Otsu’s thresholding, which automatically determines the threshold value
As a computer vision practitioner it’s critical that you understand how these methods work.
Let’s get started.
What is thresholding?
Thresholding is the binarization of an image. In general, we seek to convert a grayscale image to a binary image, where the pixels are either 0 or 255.
A simple thresholding example would be selecting a threshold value T, and then setting all pixel intensities less than T to 0, and all pixel values greater than T to 255. In this way, we are able to create a binary representation of the image.
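As a quick illustration of that operation, here is a minimal sketch in plain NumPy (the tiny gray array and the value of T below are made up purely for demonstration):

import numpy as np

# a hypothetical 2x3 grayscale image and a manually chosen threshold T
gray = np.array([[12, 180, 240],
                 [90, 200, 35]], dtype="uint8")
T = 128

# pixels below T become 0 (background), pixels at or above T become 255
binary = np.where(gray < T, 0, 255).astype("uint8")
print(binary)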
For example, take a look at the (grayscale) PyImageSearch logo below and its thresholded counterpart:
On the left, we have the original PyImageSearch logo that has been converted to grayscale. And on the right, we have the thresholded, binary representation of the PyImageSearch logo.
To construct this thresholded image I simply set my threshold value T=225. That way, all pixels p in the logo where p < T are set to 255, and all pixels p >= T are set to 0.
By performing this thresholding I have been able to segment the PyImageSearch logo from the background.
Normally, we use thresholding to focus on objects or areas of particular interest in an image. In the examples in the lesson below, we will be using thresholding to detect coins in images, segment the pieces of the OpenCV logo, and separate license plate letters and characters from the license plate itself.
Configuring your development environment
To follow this guide, you need to have the OpenCV library installed on your system.
Luckily, OpenCV is pip-installable:
$ pip install opencv-contrib-python
If you need help configuring your development environment for OpenCV, I highly recommend that you read my pip install OpenCV guide — it will have you up and running in a matter of minutes.
Project structure
Before we can apply thresholding using OpenCV and the cv2.threshold function, we first need to review our project directory structure.
Start by accessing the “Downloads” section of this tutorial to retrieve the source code and example images.
You’ll then be presented with the following directory structure:
$ tree . --dirsfirst
.
├── images
│   ├── coins01.png
│   ├── coins02.png
│   └── opencv_logo.png
├── otsu_thresholding.py
└── simple_thresholding.py

1 directory, 5 files
We have two Python scripts to review today:
- simple_thresholding.py: Demonstrates how to apply thresholding with OpenCV. Here we manually set thresholds such that we can segment our foreground from the background.
- otsu_thresholding.py: Applies Otsu’s thresholding method such that the threshold parameter is set automatically.
The benefit of Otsu’s thresholding technique is that we don’t have to fiddle with manually setting the threshold cutoff — Otsu’s method will do that automatically for us.
Inside the images directory are a number of demo images that we’ll apply these thresholding scripts to.
Implementing simple thresholding with OpenCV
Applying simple thresholding methods requires human intervention. We must specify a threshold value T. All pixel intensities below T are set to 255. And all pixel intensities greater than T are set to 0.
We could also apply the inverse of this binarization by setting all pixels greater than T to 255 and all pixel intensities below T to 0.
Let’s explore some code to apply simple thresholding methods. Open the simple_thresholding.py file in your project directory structure and insert the following code:
# import the necessary packages
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True,
    help="path to input image")
args = vars(ap.parse_args())
We start on Lines 2 and 3 by importing our required Python packages. We then parse our command line arguments on Lines 6-9.
Only a single command line argument is needed, --image, which is the path to the input image we wish to apply thresholding to.
With our imports and command line arguments taken care of, let’s move on to loading our image from disk and preprocessing it:
# load the image and display it
image = cv2.imread(args["image"])
cv2.imshow("Image", image)

# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (7, 7), 0)
Lines 12 and 13 load our input image from disk and display it on our screen.
We then preprocess the image by both:
- Converting it to grayscale
- Applying a 7×7 Gaussian blur
Applying Gaussian blurring helps remove some of the high-frequency edges in the image that we are not concerned with and allows us to obtain a “cleaner” segmentation.
Now, let’s go ahead and apply the actual thresholding:
# apply basic thresholding -- the first parameter is the image
# we want to threshold, the second value is our threshold
# check; if a pixel value is greater than our threshold (in this
# case, 200), we set it to be *black*, otherwise it is *white*
(T, threshInv) = cv2.threshold(blurred, 200, 255,
    cv2.THRESH_BINARY_INV)
cv2.imshow("Threshold Binary Inverse", threshInv)
After the image is blurred, we compute the thresholded image on Lines 23 and 24 using the cv2.threshold function. This method requires four arguments.
The first is the grayscale image that we wish to threshold; here, we supply our blurred image.
Then, we manually supply our T threshold value. We use a value of T=200.
Our third argument is the output value applied during thresholding. Any pixel intensity p that is greater than T is set to zero and any p that is less than T is set to the output value:
In our example, any pixel value that is greater than 200 is set to 0. Any value that is less than 200 is set to 255.
Finally, we must provide a thresholding method. We use the cv2.THRESH_BINARY_INV method, which indicates that pixel values p less than T are set to the output value (the third argument).
The cv2.threshold function then returns a tuple of 2 values: the first, T, is the threshold value. In the case of simple thresholding, this value is trivial since we manually supplied the value of T in the first place. But in the case of Otsu’s thresholding where T is dynamically computed for us, it’s nice to have that value. The second returned value is the thresholded image itself.
But what if we wanted to perform the reverse operation? What if we wanted to set all pixels p greater than T to the output value? Is that possible?
Of course! And there are two ways to do it. The first method is to simply take the bitwise NOT of the output threshold image. But that adds an extra line of code.
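For reference, that bitwise NOT approach would look something like this (a sketch that reuses the threshInv result from the previous block):

# flip the inverse-thresholded mask so the objects become black on white;
# this produces the same result as using cv2.THRESH_BINARY directly
thresh = cv2.bitwise_not(threshInv)
cv2.imshow("Threshold Binary (via bitwise NOT)", thresh)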
Instead, we can just supply a different flag to the cv2.threshold function:
# using normal thresholding (rather than inverse thresholding)
(T, thresh) = cv2.threshold(blurred, 200, 255, cv2.THRESH_BINARY)
cv2.imshow("Threshold Binary", thresh)
On Line 28 we apply a different thresholding method by supplying cv2.THRESH_BINARY.
In most cases you want your segmented objects to appear as white on a black background, hence using cv2.THRESH_BINARY_INV. But if you want your objects to appear as black on a white background, be sure to supply the cv2.THRESH_BINARY flag.
The last task we are going to perform is to reveal the foreground objects in the image and hide everything else. Remember when we discussed image masking? That will come in handy here:
# visualize only the masked regions in the image
masked = cv2.bitwise_and(image, image, mask=threshInv)
cv2.imshow("Output", masked)
cv2.waitKey(0)
On Line 32, we perform masking by using the cv2.bitwise_and function. We supply our original input image as the first two arguments, and then our inverted thresholded image as our mask. Remember, a mask only considers pixels in the original image where the mask is greater than zero.
Simple thresholding results
Ready to see the results of applying basic thresholding with OpenCV?
Start by accessing the “Downloads” section of this tutorial to retrieve the source code and example images.
From there, you can execute the following command:
$ python simple_thresholding.py --image images/coins01.png
On the top-left, we have our original input image. And on the top-right, we have the segmented image using inverse thresholding, where the coins appear as white on a black background.
Similarly, on the bottom-left we flip the thresholding method and now the coins appear as black on a white background.
Finally, the bottom-right applies our bitwise AND with the threshold mask and we are left with just the coins in the image (no background).
Let’s try a second image of coins:
$ python simple_thresholding.py --image images/coins02.png
Once again we are able to successfully segment the foreground of the image from the background.
But take a close look and compare the output of Figure 5 and Figure 6. You’ll notice that in Figure 5 there are some coins that appear to have “holes” in them. This is because those regions of the coins failed the threshold test, and thus we could not include them in the output thresholded image.
However, in Figure 6 you’ll notice that there are no holes — indicating that the segmentation is (essentially) perfect.
Note: Realistically, this isn’t a problem. All that really matters is that we are able to obtain the contour or outline of the coins. These small gaps inside the thresholded coin mask can be filled in using morphological operations or contour methods.
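If you want to experiment with closing those gaps yourself, here is a minimal sketch of the morphological approach (a closing operation applied to the threshInv mask from simple_thresholding.py; the 5×5 kernel size is an arbitrary choice for illustration):

# build an elliptical structuring element and apply a morphological
# closing (dilation followed by erosion) to fill small holes in the mask
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
closed = cv2.morphologyEx(threshInv, cv2.MORPH_CLOSE, kernel)
cv2.imshow("Closed Mask", closed)
cv2.waitKey(0)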
So given that the thresholding worked perfectly in Figure 6, why did it not work just as perfectly in Figure 5?
The answer is simple: lighting conditions.
While subtle, these two images were indeed captured under different lighting conditions. And since we have manually supplied a thresholding value, there is no guarantee that this threshold value T is going to work from one image to the next in the presence of lighting changes.
One method to combat this is to simply provide a threshold value T for each image you want to threshold. But manually tuning T for every image quickly becomes impractical, especially if we want our system to be dynamic and work under various lighting conditions.
The solution is to use methods such as Otsu’s method and adaptive thresholding to aid us in obtaining better results.
But for the time being, let’s look at one more example where we segment the pieces of the OpenCV logo:
$ python simple_thresholding.py --image images/opencv_logo.png
Notice how we have been able to segment the semicircles of the OpenCV logo along with the “OpenCV” text itself from the input image. While this may not look very interesting, being able to segment an image into pieces is an extremely valuable skill to have. This will become more apparent when we dive into contours and use them to quantify and identify different objects in an image.
But for the time being, let’s move on to some more advanced thresholding techniques where we do not have to manually supply a value of T.
Implementing Otsu thresholding with OpenCV
In the previous section on simple thresholding we needed to manually supply a threshold value of T. For simple images in controlled lighting conditions, it might be feasible for us to hardcode this value.
But in real-world conditions where we do not have any a priori knowledge of the lighting conditions, we can automatically compute an optimal value of T using Otsu’s method.
Otsu’s method assumes that our image contains two classes of pixels: the background and the foreground.
Furthermore, Otsu’s method assumes that the grayscale histogram of our image’s pixel intensities is bi-modal, which simply means that the histogram has two peaks.
For example, take a look at the following image of a prescription pill and its associated grayscale histogram:
Notice how the histogram clearly has two peaks — the first sharp peak corresponds to the uniform background color of the image, while the second peak corresponds to the pill region itself.
If the concept of histograms is a bit confusing to you right now, don’t worry — we’ll be covering them in more detail in our image histograms blog post. But for the time being just understand that a histogram is simply a tabulation or a “counter” on the number of times a pixel value appears in the image.
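If you’d like to inspect such a histogram yourself, a quick sketch using cv2.calcHist would look something like this (assuming a grayscale image gray loaded as in the earlier script, and matplotlib for plotting, which the rest of this tutorial does not require):

from matplotlib import pyplot as plt

# compute a 256-bin histogram of the grayscale image's pixel intensities
hist = cv2.calcHist([gray], [0], None, [256], [0, 256])
plt.plot(hist)
plt.xlabel("Pixel intensity")
plt.ylabel("Count")
plt.show()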
Based on the grayscale histogram, Otsu’s method then computes an optimal threshold value T that minimizes the within-class variance (equivalently, it maximizes the variance between the background and foreground classes).
However, Otsu’s method has no a priori knowledge of what pixels belong to the foreground and which pixels belong to the background — it’s simply trying to optimally separate the peaks of the histogram.
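To make that idea concrete, here is a minimal NumPy sketch of Otsu’s criterion; the otsu_threshold function below is purely illustrative, and in practice you would let cv2.threshold with the cv2.THRESH_OTSU flag do this work for you:

import numpy as np

def otsu_threshold(gray):
    # normalized 256-bin histogram of the grayscale image
    hist = np.bincount(gray.ravel(), minlength=256).astype("float")
    prob = hist / hist.sum()
    best_t, best_score = 0, -1.0
    # evaluate every candidate threshold and keep the one that maximizes
    # the between-class variance (equivalent to minimizing the
    # within-class variance)
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(0, t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t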
Let’s go ahead and take a look at some code to perform Otsu’s thresholding. Open the otsu_thresholding.py file in your project directory structure and insert the following code:
# import the necessary packages
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True,
    help="path to input image")
args = vars(ap.parse_args())
Lines 2 and 3 import our required Python packages while Lines 6-9 parse our command line arguments.
We only need a single switch here, --image, which is the path to our input image that we wish to apply Otsu thresholding to.
We can now load and preprocess our image:
# load the image and display it
image = cv2.imread(args["image"])
cv2.imshow("Image", image)

# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (7, 7), 0)
Lines 12 and 13 load our image from disk and display it on our screen.
We then apply preprocessing by converting the image to grayscale and blurring it to reduce high frequency noise.
Let’s now apply Otsu’s thresholding algorithm:
# apply Otsu's automatic thresholding which automatically determines
# the best threshold value
(T, threshInv) = cv2.threshold(blurred, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Threshold", threshInv)
print("[INFO] otsu's thresholding value: {}".format(T))

# visualize only the masked regions in the image
masked = cv2.bitwise_and(image, image, mask=threshInv)
cv2.imshow("Output", masked)
cv2.waitKey(0)
Applying Otsu’s method is handled on Lines 21 and 22, again using the cv2.threshold function of OpenCV.
We start by passing in the (blurred) image that we want to threshold. But take a look at the second parameter — this is supposed to be our threshold value T.
So why are we setting it to zero?
Remember that Otsu’s method is going to automatically compute the optimal value of T for us! We could technically specify any value we wanted for this argument; however, I like to supply a value of 0 as a type of “don’t care” parameter.
The third argument is the output value of the threshold, provided the given pixel passes the threshold test.
The last argument is one we need to pay extra special attention to. Previously, we had supplied values of cv2.THRESH_BINARY or cv2.THRESH_BINARY_INV depending on what type of thresholding we wanted to perform.
But now we are passing in a second flag that is logically OR’d with the previous method. Notice that this method is cv2.THRESH_OTSU, which obviously corresponds to Otsu’s thresholding method.
The cv2.threshold function will again return a tuple of 2 values for us: the threshold value T and the thresholded image itself.
In the previous section, the value of T returned was redundant and irrelevant — we already knew this value of T since we had to manually supply it.
But now that we are using Otsu’s method for automatic thresholding, this value of T becomes interesting — we do not know what the optimal value of T is ahead of time, hence why we are using Otsu’s method to compute it for us. Line 24 prints out the value of T as determined by Otsu’s method.
Finally, we display the output thresholded image to our screen on Lines 28 and 29.
Otsu thresholding results
To see Otsu’s method running, be sure to access the “Downloads” section of this tutorial to retrieve the source code and example images.
From there you can execute the following command:
$ python otsu_thresholding.py --image images/coins01.png
[INFO] otsu's thresholding value: 191.0
Pretty nice, right? We didn’t even have to supply our value of T — Otsu’s method automatically took care of this for us. And we still got a nice threshold image as an output. And if we inspect our terminal we’ll see that Otsu’s method computed a value of T=191:
So based on our input image, the optimal value of T is 191; therefore, any pixel p that is greater than 191 is set to 0, and any pixel less than 191 is set to 255 (since we supplied the cv2.THRESH_BINARY_INV flag detailed in the “Implementing simple thresholding with OpenCV” section above).
Before we move on to the next example, let’s take a second and discuss what is meant by the term “optimal.” The value of T returned by Otsu’s method may not look optimal when we visually inspect the image: we can clearly see some gaps and holes in the coins of the thresholded image. But this value is optimal in the sense that it does the best possible job of splitting the foreground from the background, assuming a bi-modal distribution of grayscale pixel values.
If the grayscale image does not follow a bi-modal distribution, then Otsu’s method will still run, but it may not give us our intended results. In that case, we will have to try adaptive thresholding, which we’ll cover in our next tutorial.
Anyway, let’s try a second image:
$ python otsu_thresholding.py --image images/coins02.png
[INFO] otsu's thresholding value: 180.0
Again, notice that Otsu’s method has done a good job separating the foreground from the background for us. And this time, Otsu’s method has determined the optimal value of T to be 180. Any pixel value greater than 180 is set to 0, and any pixel less than 180 is set to 255 (again, assuming inverse thresholding).
As you can see, Otsu’s method can save us a lot of time guessing and checking the best value of T. However, there are some major drawbacks.
The first is that Otsu’s method assumes a bi-modal distribution of the grayscale pixel intensities of our input image. If this is not the case, then Otsu’s method can return sub-par results.
Secondly, Otsu’s method is a global thresholding method. In situations where lighting conditions are semi-stable and the objects we want to segment have sufficient contrast from the background, we might be able to get away with Otsu’s method.
But when the lighting conditions are non-uniform, such as when different parts of the image are illuminated more than others, we can run into serious problems. When that’s the case, we’ll need to rely on adaptive thresholding (which we’ll cover in our next tutorial).
Summary
In this lesson, we learned all about thresholding: what thresholding is, why we use thresholding, and how to perform thresholding using OpenCV and the cv2.threshold function.
We started by performing simple thresholding, which requires us to manually supply a value of T to perform the threshold. However, we quickly realized that manually supplying a value of T is very tedious and requires us to hardcode this value, implying that this method will not work in all situations.
We then moved on to Otsu’s thresholding method, which automatically computes the optimal value of T for us, assuming a bi-modal distribution of the grayscale representation of our input image.
The problem here is that (1) our input images need to be bi-modal for Otsu’s method to correctly segment the image, and (2) Otsu’s method is a global thresholding method, which implies that we need to have at least some decent control over our lighting conditions.
In situations where our lighting conditions are less than ideal, or we simply cannot control them, we need adaptive thresholding (which is also known as local thresholding). We’ll be covering adaptive thresholding in our next tutorial.