A few days ago I mentioned that on Wednesday, August 19th at 10AM EDT I am launching an IndieGoGo crowdfunding campaign for my new book, OCR with OpenCV, Tesseract, and Python. (Note: The campaign is now complete. But you can still pre-order your copy by clicking here.)
Today I’m going to share with you:
- The Table of Contents to the book
- Additional details on how the book is structured
- What is included in the book, including source code, pre-configured VM, access to the private community forums, etc.
- The Certificate of Completion you’ll be awarded after completing all quizzes/exams associated with the text
Let’s dive in!
What is this book?
OCR with OpenCV, Tesseract, and Python will teach you how to successfully apply Optical Character Recognition to your work, projects, and research.
You will learn via practical, hands-on projects (with lots of code) so you can not only develop your own OCR projects, but feel confident while doing so.
Inside the book we will focus on:
- Getting started with OCR
- Learning the basics of the Tesseract OCR engine
- Discovering how to improve OCR accuracy using Tesseract options and configurations
- Interfacing with Tesseract via the Python programming language
- Localizing and detecting text in images using both OpenCV and Tesseract
- Using OpenCV and image processing techniques to improve OCR accuracy
- Using machine learning to denoise our images for better OCR accuracy
- Image/document registration and alignment to build an invoice scanning project
- Training our own custom deep learning models with Keras and TensorFlow
- Solving Sudoku puzzles with OCR, OpenCV, and Keras/TensorFlow
- Automatic License/Number Plate Recognition (ANPR)
- Handwriting recognition
- Performing OCR in real-time video streams
- Utilizing GPUs for faster OCR inference
- Using OCR engines in the cloud, including Amazon Rekognition, Microsoft Cognitive Services, and the Google Vision API
- Tips, suggestions, and best practices when performing OCR
Currently I have 35+ chapters planned out, with more to come!
How is this book structured?
Since we’ll be covering so many OCR techniques in depth, I’ve decided to break the book down into three volumes called “bundles”.
I’ve included a short breakdown of the three bundles below:
The “Intro to OCR” Bundle is right for you if:
- You are new to the world of OCR and Computer Vision
- You are just testing the OCR waters
- You are on a budget
Inside this bundle you will learn the fundamentals of Optical Character Recognition using Tesseract, OpenCV, and Python. And while this is the lowest tier bundle, you’ll still be getting a great education with a lot of hands-on experience.
A full list of chapter topics follows:
- Introduction
- What is Optical Character Recognition (OCR)?
- Tools, libraries, and packages for OCR
- Installing our OCR libraries and tools
- Your first OCR example with Tesseract
- Detecting digits with Tesseract
- Whitelisting and blacklisting characters with Tesseract
- Determining and correcting text orientation with Tesseract
- OCR’ing text and translating to different languages
- Using Tesseract with non-English languages
- Improving OCR accuracy with Tesseract Page Segmentation Modes (PSMs)
- Improving OCR results with OpenCV and image processing
- Utilizing spellchecking with OCR
- OCR’ing passports using computer vision
- Using OpenCV and template matching to OCR characters
- OCR’ing characters with basic computer vision and image processing
- Text bounding box localization and OCR with Tesseract
- Rotated text bounding box localization with OpenCV
- A complete text detection and OCR pipeline
- Conclusions
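To give a flavor of the Tesseract options these chapters touch on (digit detection, character whitelists and blacklists, and Page Segmentation Modes), here is a minimal sketch rather than code from the book. The helper name `build_tesseract_config` is hypothetical; it only assembles an option string from Tesseract's `--psm` flag and its `tessedit_char_whitelist` / `tessedit_char_blacklist` config variables, so it runs without Tesseract installed.

```python
def build_tesseract_config(psm=None, whitelist=None, blacklist=None):
    """Assemble a Tesseract option string (hypothetical helper, not from the book)."""
    parts = []
    if psm is not None:
        # Tesseract 4+ defines page segmentation modes 0 through 13
        if not 0 <= psm <= 13:
            raise ValueError("page segmentation mode must be between 0 and 13")
        parts.append(f"--psm {psm}")
    if whitelist:
        # Restrict recognition to only these characters
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    if blacklist:
        # Never emit these characters
        parts.append(f"-c tessedit_char_blacklist={blacklist}")
    return " ".join(parts)


# PSM 7 treats the image as a single line of text; the whitelist keeps digits only
digits_only = build_tesseract_config(psm=7, whitelist="0123456789")
print(digits_only)
```

Assuming the `pytesseract` wrapper is installed, a string like this would typically be passed as the `config` argument of `pytesseract.image_to_string`.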
The chapters inside the “Intro to OCR” Bundle will give you a strong foundation to build upon. For a more in-depth treatment of OCR, I would recommend either the “OCR Practitioner” Bundle or “OCR Expert” Bundle.
My Recommendation: The “Intro to OCR” Bundle is a great first step towards applying OCR to real-world projects. You’ll learn the fundamentals of OCR and Tesseract, empowering you to apply OCR to your own projects.
That said, if you are going with this bundle because you’re new to the world of computer vision and OCR, then you should absolutely look at the Practical Python and OpenCV and PyImageSearch Gurus add-ons. Both of these can be used to help you level-up your computer vision skills quickly (and be more successful when applying OCR).
The “OCR Practitioner” Bundle builds on the previous bundle and includes every chapter in the “Intro to OCR” Bundle. This bundle is geared towards more advanced OCR algorithms, techniques, and use cases, including deep learning, image/document alignment, OCR in real-time video streams, OCR with GPUs, cloud-based OCR APIs, and more!
Not only will you be getting every chapter in the “Intro to OCR” Bundle, but you’ll also receive the following:
- Introduction
- Training custom OCR models with Keras and TensorFlow
- Using machine learning to denoise images for better OCR accuracy
- Image and document registration
- Automatically aligning and OCR’ing a document, invoice, form, etc.
- Building an OpenCV Sudoku solver using OCR
- OCR’ing receipts
- Automatic License/Number Plate Recognition with OCR
- Text blur detection
- OCR’ing real-time video streams
- Improving text detection speed with OpenCV and GPUs
- Handwriting recognition
- Text detection and OCR with the Amazon Rekognition API
- Using the Microsoft Cognitive Services API for OCR
- OCR with the Google Vision API
- Training custom Tesseract OCR models
- Fine-tuning Tesseract OCR models
- Utilizing the EasyOCR package for fast, efficient OCR
- Conclusions
My Recommendation: The “OCR Practitioner” Bundle gives you the best bang for your buck. You should choose this bundle if you want a super in-depth treatment of OCR, but cannot afford the “OCR Expert” Bundle.
If you’re new to computer vision and deep learning, I highly suggest you also get the PyImageSearch Gurus and/or Deep Learning for Computer Vision with Python add-ons — both of these resources will teach you computer vision and deep learning quickly (ensuring you get more value out of your purchase of the OCR book).
The “OCR Expert” Bundle includes everything from both the “Intro to OCR” Bundle and “OCR Practitioner” Bundle.
It also includes:
- All bonus chapters from stretch goals during the IndieGoGo campaign (including chapters that are authored after the campaign has ended).
- A physical, printed edition of all three volumes of OCR with OpenCV, Tesseract, and Python — this is the only bundle that includes a hardcopy edition.
- Access to my private community forums for additional help and support. You’ll get faster, more detailed answers to your questions and you’ll be able to better connect with me and other readers. (Again, the other two bundles do not have access to these forums.)
- A Certificate of Completion upon successfully completing all lessons and quizzes associated with the text.
My Recommendation: You should go with the “OCR Expert” Bundle if (1) you want to study OCR in depth and (2) you want additional help and support along the way. When it comes to learning Optical Character Recognition, you just can’t beat this bundle!
Additionally, the “OCR Expert” Bundle includes a Certificate of Completion. To receive the certificate, you will need to complete all lessons and quizzes associated with the text.
After successfully completing all lessons/quizzes, you will receive your certificate and be able to embed it directly on your LinkedIn profile, thereby demonstrating your Optical Character Recognition skills.
What’s next?
There you have it: the complete Table of Contents for OCR with OpenCV, Tesseract, and Python. I hope after looking over this list you’re as excited as I am!
I also have some secret bonus chapters available.
If you are interested in OCR, already have OCR project ideas, or have a need for it at your company, please click the button below to grab your special pre-pre-ordered copy of my OCR book:
Comment section
Hey, Adrian Rosebrock here, author and creator of PyImageSearch. While I love hearing from readers, a couple years ago I made the tough decision to no longer offer 1:1 help over blog post comments.