Detecting and OCR’ing Digits with Tesseract and Python

In a previous tutorial, we implemented our very first OCR project. We saw that Tesseract worked well on some images but returned total nonsense for other examples. Part of being a successful OCR practitioner is learning that when you see this garbled, nonsensical output from Tesseract, it means some combination of (1) your image pre-processing techniques and (2) your Tesseract OCR options are incorrect.

To learn how to detect and OCR digits with Tesseract and Python, just keep reading.

Tesseract is a tool, like any other software package. Just like a data scientist can’t simply import millions of customer purchase records into Microsoft Excel and expect Excel to recognize purchase patterns automatically, it’s unrealistic to expect Tesseract to figure out what you need to OCR automatically and correctly output it.

Instead, you need to learn how to configure Tesseract properly for the task at hand. For example, suppose you are tasked with creating a computer vision application to automatically OCR business cards for their phone numbers.

How would you go about building such a project? Would you try to OCR the entire business card and then use a combination of regular expressions and post-processing pattern recognition to parse out the digits?

Or would you take a step back and examine the Tesseract OCR engine itself — is it possible to tell Tesseract to only OCR digits?

It turns out it is. And that’s what we’ll cover in this tutorial.
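For comparison, here is a rough sketch of the "OCR everything, then post-process" route mentioned above. The pattern and the function name extract_phone_numbers are assumptions made for illustration; the pattern only covers numbers shaped like 1-800-275-2273 and is not a general phone number parser. The OCR text is simulated, so no Tesseract call is made.

```python
import re

# Hypothetical pattern for numbers shaped like 1-800-275-2273 or 555-123-4567.
# This is an illustrative assumption, not a general phone number parser.
PHONE_PATTERN = re.compile(r"(?:\d-)?\d{3}-\d{3}-\d{4}")


def extract_phone_numbers(ocr_text):
    """Return every substring of ocr_text that looks like a phone number."""
    return PHONE_PATTERN.findall(ocr_text)


# simulated OCR output for a business card (no Tesseract call happens here)
sample_text = "a\nApple Support\n1-800-275-2273\n"
print(extract_phone_numbers(sample_text))
```

This works, but every new number format needs another pattern, which is why restricting Tesseract itself to digits is often the simpler starting point.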

Learning Objectives

In this tutorial, you will:

Gain hands-on experience OCR’ing digits from input images
Extend our previous OCR script to handle digit recognition
Learn how to configure Tesseract to only OCR digits
Pass in this configuration to Tesseract via the pytesseract library

Configuring your development environment

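To follow this guide, you need the OpenCV library and the pytesseract package installed on your system, along with the Tesseract OCR engine itself.

Luckily, both Python packages are pip-installable:

$ pip install opencv-contrib-python
$ pip install pytesseract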

If you need help configuring your development environment for OpenCV, I highly recommend that you read my pip install OpenCV guide — it will have you up and running in a matter of minutes.
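If you want a quick sanity check before moving on, a small script along these lines can report what is available. It is only a sketch: check_environment is a hypothetical helper name, and it assumes the Tesseract binary is reachable on your PATH under the name tesseract, which is what pytesseract looks for by default.

```python
import importlib.util
import shutil


def check_environment():
    """Report which pieces used by this tutorial appear to be available."""
    return {
        # Python packages imported by the ocr_digits.py script
        "cv2": importlib.util.find_spec("cv2") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # the Tesseract executable (assumed to be on PATH as "tesseract")
        "tesseract binary": shutil.which("tesseract") is not None,
    }


for name, found in check_environment().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```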

Digit Detection and Recognition with Tesseract

In the first part of this tutorial, we’ll review digit detection and recognition, including real-world problems where we may wish to OCR only digits.

From there, we’ll review our project directory structure, and I’ll show you how to perform digit detection and recognition with Tesseract. We’ll wrap up this tutorial with a review of our digit OCR results.

What Is Digit Detection and Recognition?

As the name suggests, digit recognition is the process of OCR’ing and identifying only digits, purposely ignoring other characters. Digit recognition is often applied to real-world OCR projects (a montage of which can be seen in Figure 2), including:

Extracting information from business cards
Building an intelligent water meter reader
Bank check and credit card OCR

**Figure 2.** A montage of different applications of digit detection and recognition. Water meter image source: pyimg.co/vlx63.

Our goal here is to “cut through the noise” of the non-digit characters. We will instead “laser in” on the digits. Luckily, accomplishing this digit recognition task is relatively easy once we supply the correct parameters to Tesseract.

Project Structure

Let’s review the directory structure for this project:

|-- apple_support.png
|-- ocr_digits.py

Our project consists of one testing image (apple_support.png) and our ocr_digits.py Python script. The script accepts an image and an optional “digits only” setting and reports OCR results accordingly.

OCR’ing Digits with Tesseract and OpenCV

We are now ready to OCR digits with Tesseract. Open a new file, name it ocr_digits.py, and insert the following code:

# import the necessary packages
import pytesseract
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image to be OCR'd")
ap.add_argument("-d", "--digits", type=int, default=1,
	help="whether or not *digits only* OCR will be performed")
args = vars(ap.parse_args())

As you can see, we’re using the pytesseract package in conjunction with OpenCV. After our imports are taken care of, we parse two command line arguments:

--image: Path to the image to be OCR’d
--digits: An integer flag indicating whether or not we should OCR digits only (by default it is set to 1, which enables digits-only mode; any value ≤0 disables it)

Let’s go ahead and load our image and perform OCR:

# load the input image, convert it from BGR to RGB channel ordering,
# and initialize our Tesseract OCR options as an empty string
image = cv2.imread(args["image"])
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
options = ""

# check to see if *digit only* OCR should be performed, and if so,
# update our Tesseract OCR options
if args["digits"] > 0:
	options = "outputbase digits"

# OCR the input image using Tesseract
text = pytesseract.image_to_string(rgb, config=options)
print(text)

OpenCV loads images in BGR channel ordering, whereas pytesseract expects RGB. Lines 16 and 17 load the input --image and swap the color channels accordingly.
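To make the channel swap concrete, here is a toy sketch that uses plain Python lists instead of a real image. The helper bgr_to_rgb is hypothetical and exists only for illustration; in the actual script, cv2.cvtColor performs the same reordering on the loaded image.

```python
def bgr_to_rgb(pixels):
    """Reverse the channel order of every (b, g, r) pixel in a 2D grid."""
    return [[tuple(reversed(pixel)) for pixel in row] for row in pixels]


# a 1x2 "image" in BGR order: one pure blue pixel and one pure red pixel
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
rgb_image = bgr_to_rgb(bgr_image)

print(bgr_image)
print(rgb_image)
```

The pure blue pixel is (255, 0, 0) in BGR order but (0, 0, 255) in RGB order, so skipping the conversion would hand the OCR step an image with its red and blue channels swapped.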

We then establish our Tesseract options (Lines 18–23). Configuring Tesseract with options allows for more granular control over Tesseract’s methods under the hood to perform OCR.

For now, our options are either empty (Line 18) or outputbase digits, indicating that we will only OCR digits on the input image (Lines 22 and 23).
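If you find it easier to reason about this branch in isolation, the same logic can be pulled into a small function. This is just a sketch of the script's behavior; build_tesseract_options is a hypothetical name that does not appear in the original script.

```python
def build_tesseract_options(digits):
    """Mirror the script's logic: digits-only config when digits > 0."""
    return "outputbase digits" if digits > 0 else ""


# show the config string produced for a few values of --digits
for value in (1, 0, -1):
    print(value, repr(build_tesseract_options(value)))
```

Keeping the option string in one place makes it easier to try different settings without touching the rest of the script.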

From there, we use the image_to_string function call while passing our rgb image and our configuration options (Line 26). Notice that we’re using the config parameter and including the digits only setting if the --digits command line argument is greater than 0.

Finally, we show the OCR text results in our terminal (Line 27). Let’s see if those results meet our expectations.

Digit OCR Results

With our script implemented, open a terminal and execute the following command:

$ python ocr_digits.py --image apple_support.png
1-800-275-2273

As input to our ocr_digits.py script, we’ve supplied a sample business card-like image that contains the text “Apple Support,” along with the corresponding phone number (Figure 3). Our script can correctly OCR the phone number, displaying it to our terminal while ignoring the “Apple Support” text.

**Figure 3.** Business card of Apple support.

One of the problems of using Tesseract via the command line or with the image_to_string function is that it becomes quite hard to debug exactly how Tesseract arrived at the final output.

Once we gain some more experience working with the Tesseract OCR engine, we’ll turn our attention to visually debugging and eventually filtering out extraneous characters via confidence/probability scores. For the time being, please pay attention to the options and configurations we’re supplying to Tesseract to accomplish our goals (i.e., digit recognition).
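As a small preview of what confidence-based filtering might look like, here is a hedged sketch. It assumes the data has the shape returned by pytesseract.image_to_data with output_type=pytesseract.Output.DICT, that is, parallel lists under the "text" and "conf" keys. The function name filter_by_confidence, the threshold of 50, and the sample values are all made up for illustration; no Tesseract call is made.

```python
def filter_by_confidence(data, min_conf=50.0):
    """Keep (text, confidence) pairs whose confidence is at least min_conf.

    data is assumed to hold parallel lists under the "text" and "conf" keys.
    """
    kept = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may arrive as a string or a number, so normalize it to float;
        # rows that are not words typically carry empty text and a conf of -1
        if text.strip() and float(conf) >= min_conf:
            kept.append((text, float(conf)))
    return kept


# hand-written stand-in for real Tesseract output (values are invented)
fake_data = {
    "text": ["", "a", "Apple", "1-800-275-2273"],
    "conf": ["-1", "21", "91", "96"],
}
print(filter_by_confidence(fake_data))
```

With real OCR output, you would pass in the dictionary returned by pytesseract and tune the threshold to your images.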

If you instead want to OCR all characters (not just limited to digits), you can set the --digits command line argument to any value ≤0:

$ python ocr_digits.py --image apple_support.png --digits 0
a
Apple Support
1-800-275-2273

Notice how the “Apple Support” text is now included with the phone number in the OCR output. But what’s up with that “a” in the output? Where is that coming from?

The “a” in the output is Tesseract mistaking the leaf at the top of the Apple logo for a letter (Figure 4).

**Figure 4.** The leaf at the *top* of the Apple logo confuses Tesseract into treating it as a letter.

Summary

In this tutorial, you learned how to configure Tesseract and pytesseract to OCR only digits. We then used our Python script to handle OCR’ing the digits.

You’ll want to pay close attention to the config and options we supply to Tesseract. Frequently, successfully applying Tesseract to an OCR project depends on providing the correct set of configurations.

In our next tutorial, we’ll continue exploring Tesseract options by learning how to whitelist and blacklist a custom set of characters.
