Text skew correction with OpenCV and Python

Today’s tutorial is a Python implementation of my favorite blog post by Félix Abecassis on the process of text skew correction (i.e., “deskewing text”) using OpenCV and image processing functions.

Given an image containing a rotated block of text at an unknown angle, we need to correct the text skew by:

1. Detecting the block of text in the image.
2. Computing the angle of the rotated text.
3. Rotating the image to correct for the skew.

We typically apply text skew correction algorithms in the field of automatic document analysis, but the process itself can be applied to other domains as well.

To learn more about text skew correction, just keep reading.

Text skew correction with OpenCV and Python

The remainder of this blog post will demonstrate how to deskew text using basic image processing operations with Python and OpenCV.

We’ll start by creating a simple dataset that we can use to evaluate our text skew corrector.

We’ll then write Python and OpenCV code to automatically detect and correct the text skew angle in our images.

Creating a simple dataset

Similar to Félix’s example, I have prepared a small dataset of four images that have been rotated by a given number of degrees:

**Figure 1:** Our four example images that we’ll be applying text skew correction to with OpenCV and Python.

The text block itself is from Chapter 11 of my book, Practical Python and OpenCV, where I’m discussing contours and how to utilize them for image processing and computer vision.

The filenames of the four files follow:

$ ls images/
neg_28.png	neg_4.png	pos_24.png	pos_41.png

The first part of the filename specifies whether our image has been rotated counter-clockwise (negative) or clockwise (positive).

The second component of the filename is the actual number of degrees the image has been rotated by.

The goal of our text skew correction algorithm will be to correctly determine the direction and angle of the rotation, then correct for it.
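To make the naming convention concrete, here is a minimal sketch that decodes the expected rotation from a filename. The helper name `expected_angle` is hypothetical and not part of the original script; it only assumes the `neg_`/`pos_` prefix and degree suffix described above.

```python
import os

def expected_angle(filename):
    """Return the signed rotation (in degrees) encoded in a dataset filename."""
    # strip the directory and extension: "images/neg_28.png" -> "neg_28"
    stem = os.path.splitext(os.path.basename(filename))[0]
    direction, degrees = stem.split("_")
    # "neg" marks a counter-clockwise rotation, "pos" a clockwise one
    sign = -1 if direction == "neg" else 1
    return sign * int(degrees)

for name in ["neg_28.png", "neg_4.png", "pos_24.png", "pos_41.png"]:
    print(name, expected_angle(name))
```

Values decoded this way can serve as ground truth when checking the angles the script reports later in the post.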

To see how our text skew correction algorithm is implemented with OpenCV and Python, be sure to read the next section.

Deskewing text with OpenCV and Python

To get started, open up a new file and name it correct_skew.py.

From there, insert the following code:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

Lines 2-4 import our required Python packages. We’ll be using OpenCV via our cv2 bindings, so if you don’t already have OpenCV installed on your system, please refer to my list of OpenCV install tutorials to help you get your system set up and configured.

We then parse our command line arguments on Lines 7-10. We only need a single argument here, --image, which is the path to our input image.
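If command line arguments are new to you, the following sketch shows what the parser produces without touching the shell. Wrapping the parser in a `build_parser` function and passing an explicit argument list are choices made for this illustration only; the script itself reads the arguments from the command line.

```python
import argparse

def build_parser():
    # same single required argument as in correct_skew.py
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True,
        help="path to input image file")
    return ap

# simulate: python correct_skew.py --image images/neg_4.png
args = vars(build_parser().parse_args(["--image", "images/neg_4.png"]))
print(args["image"])

# the short flag is equivalent
short = vars(build_parser().parse_args(["-i", "images/pos_24.png"]))
print(short["image"])
```

Because the argument is marked `required=True`, running the script without `--image` makes argparse print a usage message and exit, which is the error some readers report in the comments below.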

The image is then loaded from disk on Line 13.

Our next step is to isolate the text in the image:

# convert the image to grayscale and flip the foreground
# and background to ensure foreground is now "white" and
# the background is "black"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# threshold the image, setting all foreground pixels to
# 255 and all background pixels to 0
thresh = cv2.threshold(gray, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

Our input images contain text that is dark on a light background; however, to apply our text skew correction process, we first need to invert the image so that the text is light on a dark background.

When applying computer vision and image processing operations, it’s common for the foreground to be represented as light while the background (the part of the image we are not interested in) is dark.

A thresholding operation (Lines 23 and 24) is then applied to binarize the image:

**Figure 2:** Applying a thresholding operation to binarize our image. Our text is now white on a black background.
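To see what the inversion and Otsu thresholding steps do to the pixel values, here is a NumPy-only sketch run on a tiny synthetic "page". The functions `invert`, `otsu_threshold`, and `binarize` are hypothetical stand-ins written for this illustration; OpenCV's own implementation may differ in its details, so treat this as an approximation of the idea rather than a replacement for cv2.threshold.

```python
import numpy as np

def invert(gray):
    # for 8-bit images a bitwise NOT maps each value v to 255 - v
    return np.bitwise_not(gray)

def otsu_threshold(gray):
    """Pick the level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)
    w0 = np.cumsum(hist)            # pixel count at or below each level
    w1 = hist.sum() - w0            # pixel count above each level
    s0 = np.cumsum(hist * levels)   # intensity sum at or below each level
    s1 = s0[-1] - s0                # intensity sum above each level
    valid = (w0 > 0) & (w1 > 0)     # both classes must be non-empty
    between = np.zeros(256)
    mu0 = s0[valid] / w0[valid]
    mu1 = s1[valid] / w1[valid]
    between[valid] = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    return int(np.argmax(between))

def binarize(gray):
    # pixels strictly above the threshold become 255, the rest 0
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)

# toy "page": light background (230) with two dark "text" pixels (20)
page = np.full((3, 4), 230, dtype=np.uint8)
page[1, 1:3] = 20

thresh = binarize(invert(page))
print(thresh)
```

After inversion the two text pixels are the brightest values in the image, so they end up as the only white (255) pixels in the binarized result.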

Given this thresholded image, we can now compute the minimum rotated bounding box that contains the text regions:

# grab the (row, column) coordinates of all pixel values that
# are greater than zero, then use these coordinates to
# compute a rotated bounding box that contains all
# coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
	angle = -(90 + angle)

# otherwise, just take the inverse of the angle to make
# it positive
else:
	angle = -angle

Line 30 finds the coordinates of all pixels in the thresh image that are part of the foreground. Note that np.where returns them in (row, column) order, that is, (y, x) rather than (x, y).

We pass these coordinates into cv2.minAreaRect which then computes the minimum rotated rectangle that contains the entire text region.
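The coordinate order matters if you later want to draw the rectangle on the image. The following sketch, using a small made-up array, shows what np.where and np.column_stack produce and how to obtain conventional (x, y) points instead. The variable names `coords_rc` and `coords_xy` are illustrative.

```python
import numpy as np

# tiny binary image: three foreground pixels
thresh = np.array([
    [0,   0, 255],
    [0, 255, 255],
    [0,   0,   0]], dtype=np.uint8)

# np.where returns one index array per axis: rows first, then columns
rows, cols = np.where(thresh > 0)

coords_rc = np.column_stack((rows, cols))  # (row, column) order, as in the script
coords_xy = np.column_stack((cols, rows))  # (x, y) order used by OpenCV drawing calls

print(coords_rc.tolist())
print(coords_xy.tolist())
```

Swapping the columns mirrors the point set, so the angle handling that follows would need to be adjusted accordingly, which matches what several readers report in the comments.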

The cv2.minAreaRect function returns angle values in the range [-90, 0). As the rectangle is rotated clockwise the angle value increases towards zero. When zero is reached, the angle is set back to -90 degrees again and the process continues.

Note: For more information on cv2.minAreaRect, please see this excellent explanation by Adam Goodwin.

Lines 37 and 38 handle the case where the angle is less than -45 degrees, in which case we need to add 90 degrees to the angle and then negate it.

Otherwise, Lines 42 and 43 simply negate the angle.
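The two branches can also be written as a single wrap-around, which makes the logic easier to test in isolation. The sketch below is not the post's code: `skew_from_rect_angle` is a hypothetical helper, and `tutorial_branches` simply restates the if/else above for comparison. Some newer OpenCV releases are reported to return the angle in a different range, such as (0, 90]; the wrap-around only covers that case under the assumption that the two conventions differ by a multiple of 90 degrees, so verify it against your own OpenCV version.

```python
def tutorial_branches(angle):
    # the if/else from the script, valid for angles in [-90, 0)
    if angle < -45:
        return -(90 + angle)
    return -angle

def skew_from_rect_angle(raw_angle):
    """Map a rectangle angle to a correction angle in (-45, 45]."""
    # a rectangle looks the same after a 90 degree turn, so wrap the
    # raw angle into [-45, 45) before negating it
    wrapped = (raw_angle + 45.0) % 90.0 - 45.0
    return -wrapped

for raw in [-85.914, -60.0, -45.0, -41.037, -10.0]:
    print(raw, tutorial_branches(raw), skew_from_rect_angle(raw))
```

For every raw angle in [-90, 0) the two functions agree, so the wrap-around is a drop-in restatement of the original branches for that range.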

Now that we have determined the text skew angle, we need to apply an affine transformation to correct for the skew:

# rotate the image to deskew it
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h),
	flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

Lines 46 and 47 determine the center (x, y)-coordinate of the image. We pass the center coordinates and rotation angle into cv2.getRotationMatrix2D (Line 48). This rotation matrix M is then used to perform the actual transformation on Lines 49 and 50.
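For intuition about what the rotation matrix contains, here is a NumPy sketch that builds the same 2x3 matrix from the formula given in the OpenCV documentation for getRotationMatrix2D. The function name `rotation_matrix_2d` and the sample center (100, 50) are made up for this example.

```python
import numpy as np

def rotation_matrix_2d(center, angle, scale):
    """Build a 2x3 affine matrix rotating about `center` by `angle` degrees."""
    cx, cy = center
    alpha = scale * np.cos(np.deg2rad(angle))
    beta = scale * np.sin(np.deg2rad(angle))
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy]])

M = rotation_matrix_2d((100, 50), 90, 1.0)

# points are written in homogeneous form (x, y, 1)
center_out = M @ np.array([100, 50, 1])
right_out = M @ np.array([110, 50, 1])

print(np.round(center_out, 6))  # the center stays where it is
print(np.round(right_out, 6))   # a point right of center moves above it
```

A positive angle therefore rotates the content counter-clockwise on screen (remember that y grows downward in image coordinates), while the center of rotation is left unchanged.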

Finally, we display the results to our screen:

# draw the correction angle on the image so we can validate it
cv2.putText(rotated, "Angle: {:.2f} degrees".format(angle),
	(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

# show the output image
print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)

Line 53 draws the angle on our image so we can verify that the output image matches the rotation angle (you would obviously want to remove this line in a document processing pipeline).

Lines 57-60 handle displaying the output image.

Skew correction results

To grab the code + example images used inside this blog post, be sure to use the “Downloads” section at the bottom of this post.

From there, execute the following command to correct the skew for our neg_4.png image:

$ python correct_skew.py --image images/neg_4.png 
[INFO] angle: -4.086

**Figure 3:** Applying skew correction using OpenCV and Python.

Here we can see that the input image has a counter-clockwise skew of 4 degrees. Applying our skew correction with OpenCV detects this 4 degree skew and corrects for it.

Here is another example, this time with a counter-clockwise skew of 28 degrees:

$ python correct_skew.py --image images/neg_28.png 
[INFO] angle: -28.009

**Figure 4:** Deskewing images using OpenCV and Python.

Again, our skew correction algorithm is able to correct the input image.

This time, let’s try a clockwise skew:

$ python correct_skew.py --image images/pos_24.png 
[INFO] angle: 23.974

**Figure 5:** Correcting for skew in text regions with computer vision.

And finally a more extreme clockwise skew of 41 degrees:

$ python correct_skew.py --image images/pos_41.png 
[INFO] angle: 41.037

**Figure 6:** Deskewing text with OpenCV.

Regardless of skew angle, our algorithm is able to correct for skew in images using OpenCV and Python.
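As a quick sanity check, the reported angles can be compared against the rotations encoded in the filenames. The sketch below uses only the four "[INFO] angle" values printed above; the dictionary layout is simply a convenient way to organize them for this illustration.

```python
# (expected, detected) pairs: expected comes from the filename,
# detected from the "[INFO] angle" output of each run above
results = {
    "neg_4.png": (-4, -4.086),
    "neg_28.png": (-28, -28.009),
    "pos_24.png": (24, 23.974),
    "pos_41.png": (41, 41.037),
}

# absolute error in degrees for each image
errors = {name: abs(detected - expected)
          for name, (expected, detected) in results.items()}

for name, err in errors.items():
    print("{}: error {:.3f} degrees".format(name, err))

worst = max(errors, key=errors.get)
print("largest error:", worst)
```

On these four images every detected angle is within a tenth of a degree of the rotation encoded in its filename.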

Summary

In today’s blog post I provided a Python implementation of Félix Abecassis’ approach to skew correction.

The algorithm itself is quite straightforward, relying on only basic image processing techniques such as thresholding, computing the minimum area rotated rectangle, and then applying an affine transformation to correct the skew.

We would commonly use this type of text skew correction in an automatic document analysis pipeline where our goal is to digitize a set of documents, correct for text skew, and then apply OCR to convert the text in the image to machine-encoded text.

I hope you enjoyed today’s tutorial!

Download the Source Code and FREE 17-page Resource Guide

Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!

About the Author

Hi there, I’m Adrian Rosebrock, PhD. All too often I see developers, students, and researchers wasting their time, studying the wrong things, and generally struggling to get started with Computer Vision, Deep Learning, and OpenCV. I created this website to show you what I believe is the best possible way to get your start.

55 responses to: Text skew correction with OpenCV and Python

Doug

February 20, 2017 at 12:29 pm

I would be very interested in how to extend this technique for 3 dimensions. e.g.: The common case where someone has taken a picture of some text using a camera phone, but from an off angle.
- Adrian Rosebrock
  
  February 20, 2017 at 1:10 pm
  
  This thread on Twitter was just brought to my attention and would likely be helpful for you.
  - Doug
    
    February 21, 2017 at 9:05 am
    
    That looks perfect. Thank you.
Adam

February 20, 2017 at 1:50 pm

This method works nicely for perfect scans (without noise) of justified text (or at least left or right aligned with many lines). A better approach would be to detect the blank space between lines and find the mean angle of the lines fitting in this space. Is there a way to do this efficiently?
- Adrian Rosebrock
  
  February 22, 2017 at 1:44 pm
  
  For a more robust approach, take a look at the link in my reply to “Doug” above.
Carlos

February 20, 2017 at 3:38 pm

I implemented this technique in an application some time ago. It is simple and fast. But it is not very suitable when you have more complex layouts such as forms. The technique that worked more accurately was to rotate the image through several angles from negative to positive and count which angle produced the most white pixels (on the binarized image) in every row.
But this is too slow. Could you point me to some approach that is faster than this, please?
- Adrian Rosebrock
  
  February 22, 2017 at 1:44 pm
  
  If you are working with form images I think it would be best to match areas of the forms to a template form rather than applying skew correction.
  - Rakesh
    
    November 27, 2019 at 5:51 am
    
    Hello Adrian,
    
    Thanks for sharing this blog. Currently, I’m also working on complex layouts such as bank application forms. Could you please share some material or links on dealing with the skew issue in bank forms?
    
    Thanks,
    Rakesh
sumant

February 21, 2017 at 10:02 am

I get the right bounding boxes after thresholding only if I swap the indexes for np.where on Line 30. This would necessitate negating the angle on Line 48.
Rohit

March 21, 2017 at 3:49 am

Thanks Adrian for this informative post. It took me time to figure out how cv2.minAreaRect works. If the angle returned by cv2.minAreaRect is bordering -45 (let’s say -50 or -55), will this code not lead to the text being skewed in a perpendicular direction rather than a horizontal direction? Just asking out of curiosity.

Thanks again
- Adrian Rosebrock
  
  March 21, 2017 at 7:05 am
  
  I’m not sure what you mean, but Lines 37-43 handle this case. I would test out the code on an example image.
Filozof50

March 21, 2017 at 10:42 am

YOU ARE A GOD!!! Thanks alot! You did amazing job here. 🙂
zara

March 27, 2017 at 9:00 am

Hi Adrian.

What does coords = np.column_stack(np.where(thresh > 0)) do? Can we replace it with coords = cv2.findNonZero(thresh)? And what if the image is in portrait orientation?
- Adrian Rosebrock
  
  March 28, 2017 at 1:00 pm
  
  The np.where function returns the indexes of the thresh array that have a pixel value greater than zero. Calling np.column_stack turns them into (x, y)-coordinates.
  - Eric Xu
    
    October 7, 2018 at 7:23 am
    
    actually, np.where function returns (y,x)
    
    image = np.array([
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0]])
    
    y,x = np.where(image > 0)
    coords = np.column_stack((x,y))
    print(coords)
    '''
    [[2 0]
     [1 1]
     [2 1]]
    '''
Mathew Orman

April 4, 2017 at 8:12 am

This is not useful; it rotates the rectangle, not the text. If the image contains rotated and cropped text, this code returns angle = 0.0 and does not correct the skew…
- zara
  
  April 20, 2017 at 4:01 am
  
  Yeah. In my case it’s behaving like that too.
OpenCV Learner

June 7, 2017 at 9:42 am

Dear Adrian,

I have used your codes and help so many times but never gave thanks.
Currently I am working on a summer project (just for fun) and I needed something like this to process rotated musical sheets. Works perfectly!

So many thanks to you and keep up the good work!
- Adrian Rosebrock
  
  June 9, 2017 at 1:48 pm
  
  Thank you for the comment, I’m happy to hear the tutorial helped you! 🙂 Best of luck with the rest of your summer projects. Projects like yours are the best way to learn. Keep it up!
Gabriel

June 10, 2017 at 10:12 pm

Hi Adrian,
I’m using your code but getting the “coords” using cv2.findContours.
When doing that, my object was being rotated in the wrong direction. To fix that I needed to change “angle = -(90 + angle)” to “angle = (90 + angle)”, i.e., remove the minus sign.
What is the difference between the way you got the coords and “cv2.findContours”?
- Adrian Rosebrock
  
  June 13, 2017 at 11:09 am
  
  The method I used takes all thresholded (x, y)-coordinates and uses them to estimate the rotation. The cv2.findContours function will only find the outline of the region, provided you are computing the bounding box.
Eddie

October 26, 2017 at 7:59 pm

I am trying to figure out how to draw a box around the text but this is not working very well…

rect = cv2.minAreaRect(coords)
box = cv2.boxPoints(rect)
box = np.int0(box)
cv2.drawContours(image,[box],0,(0,255,0),2)

Any ideas? Thanks!
- Adrian Rosebrock
  
  October 27, 2017 at 11:10 am
  
  Hi Eddie — what errors are you getting, if any? What is the format of coords? Without being on your system or having additional information, this is hard for me to solve.
- Nasser R. Sanchez
  
  November 22, 2017 at 8:03 pm
  
  Hi Adrian,
  
  First thank you so much for your great blog!
  About Eddie’s question, I was also trying to draw a box around the text using pretty much the same instructions. There are no errors, but the rectangle comes out skewed some degrees clockwise (function: cv2.drawContours) even though the angle comes out correct (function: cv2.minAreaRect(coords)).
  But after some hours of trying and searching on the internet, I realized it has something to do with the way NumPy and OpenCV use the coordinate system.
  For now what I did was swap the columns given by the NumPy where function:
  coords = np.column_stack(np.where(wb > 0)[::-1])
  I also changed the angle calculation part. I think this is also the problem that Gabriel has when using cv2.findContours to get the coords.
  What I would like to ask you is whether just swapping the columns is the way to do it or if there is a better way.
  Again thank you for your work done in your blog.
  - Adrian Rosebrock
    
    November 25, 2017 at 12:42 pm
    
    Hi Nasser — swapping the columns seems perfectly reasonable here.
Lucian

November 8, 2017 at 8:54 am

Somehow, the line gray = cv2.bitwise_not(gray) makes the output image blurred in some parts. If I comment out this line, the program does not work anymore.
How can I avoid the blur?
- Adrian Rosebrock
  
  November 9, 2017 at 6:31 am
  
  Hi Lucian — I haven’t encountered this issue before with cv2.bitwise_not so I’m honestly not sure what the problem is here. Are you using the example images from the blog post or your own images?
rex

January 9, 2018 at 5:31 pm

the part you put [-90, 0)
is it a typo or it means something different?
- Adrian Rosebrock
  
  January 10, 2018 at 12:49 pm
  
  That is the range of the interval. “[” means “including” while “)” means “does not include”.
Lorenzo

January 11, 2018 at 8:08 am

Hi Adrian, your code works beautifully for long lines but it’s easily confused by capitalized single words. Take the word “Example” for example 🙂 especially with a big graphic capital “E”. A similar situation arises with words like “roll” where the final “l” ruins the game.
I’m thinking about looking for the floor of the words rather than a full box (I have single lines only), maybe finding the bottom black pixel for each column and then the line that fits best. I think I cannot just pick two, max and min, and fit a line through those, because letters with a long leg, like “p”, may break it.
Maybe there is something simpler, maybe cutting the line in half and using Hough lines? How would you approach this?

Thanks
- Adrian Rosebrock
  
  January 11, 2018 at 1:10 pm
  
  There are a few ways to do this. One method would be to find the min and max (x, y)-values for a line. You could also use thresholding + morphological operations to close the gap in between lines as well.
Ron

March 12, 2018 at 6:34 am

Hi Adrian,

Can I use the same approach for detecting Italic fonts?
- Adrian Rosebrock
  
  March 14, 2018 at 1:08 pm
  
  The italic fonts would be naturally skewed so I think this method wouldn’t perform as well. Try it and see.
lambert

August 24, 2018 at 8:10 am

Hi,Adrian!
I’m a starter when working with opencv,
“angle = cv2.minAreaRect(coords)[-1]”,
what does the “[-1]” mean?
- Adrian Rosebrock
  
  August 24, 2018 at 8:27 am
  
  The cv2.minAreaRect function returns multiple values. The “-1” says we want the final value in the tuple. Be sure to read up on Python indexing as well as array slicing.
  - lambert
    
    August 25, 2018 at 3:16 am
    
    thank you Adrian,it helps me a lot!
    I’m from China and I think your tutorials about OpenCV are very practical and easy for beginners to read. I will recommend your tutorials to more friends.
    - Adrian Rosebrock
      
      August 30, 2018 at 9:36 am
      
      Thank you Lambert! 🙂
Anshu

September 11, 2018 at 6:52 am

It doesn’t work for images rotated by more than ±40 degrees. Is there any way to fix it?
Leks

September 17, 2018 at 11:22 am

Is it applicable to a live camera stream? I always get angle 0.0
Samarth

October 26, 2018 at 3:39 am

How can I call just the preprocessing and image deskewing script from my C# project? To be specific, which arguments need to be passed, and how?
- Adrian Rosebrock
  
  October 29, 2018 at 1:43 pm
  
  Sorry, I don’t have any experience with OpenCV and C# together. First you should check if there is an OpenCV and C# integration (I assume there is). From there refer to the documentation for the proper function calls.
Ashish

January 16, 2019 at 5:37 am

How can I make this work for ID cards. Instead of plain text images I have ID cards (driving license) but the program gives 0 degree.
Shreeja

February 13, 2019 at 7:09 am

Hey Adrian,

what if some of my images are vertical, how do i make them horizontal using this code?
- Adrian Rosebrock
  
  February 14, 2019 at 12:58 pm
  
  You can use the cv2.rotate or imutils.rotate function to change the orientation of your images.
zhila

February 19, 2019 at 11:30 am

Adrian, I am a beginner in Python and I could not execute this code. I get this error: usage: text rotation.py [-h] -i IMAGE
text rotation.py: error: the following arguments are required: -i/--image
- Adrian Rosebrock
  
  February 20, 2019 at 12:13 pm
  
  You need to pass the command line arguments to the script.
Avinash

April 22, 2019 at 10:02 am

what if i have to do this on a image with black background?
- Adrian Rosebrock
  
  April 25, 2019 at 8:57 am
  
  You invert the image using cv2.bitwise_not.
William

May 24, 2019 at 5:20 pm

Hello, thanks for these articles! I am wondering if you have a suggestion for upside down images of text?
Philipe Huan

June 3, 2019 at 11:11 am

hi, is there a way to apply the code in an image like a face image?
- Adrian Rosebrock
  
  June 6, 2019 at 7:59 am
  
  You mean face alignment?
tamilselvan

June 20, 2019 at 4:51 am

This method works well when there is a 90 degree rotation. If the image is rotated by 180 degrees, this method fails. Is there any way to solve this problem?
Tanishk Sachdeva

July 18, 2019 at 3:39 am

Hi Adrian is there any way where we can train a model to automatically detect the pos/neg and the angle so that we can use this approach in production where the user just uploads the image and we can align it.
Rakesh

March 12, 2020 at 7:37 am

Hello Adrian,
The code fails to perform skew correction when the angle is in the range (-2, 0), that is, for a small angle of inclination.

Any suggestions to improve efficiency?
Looking forward to hearing from you.
