Hello all, I hope you
are doing well and keeping safe. I am writing a new blog post after a
long break, as I am still adjusting to the new lifestyle. I am not sure
when a vaccine for #covid19 will arrive
and when we will get back to our normal life. Anyway, let's get
started. Today we will learn how to detect and localize text in
an image using Tesseract OCR.
Tesseract is an optical character recognition engine for various
operating systems. It is free software, released under the Apache
License. Originally developed by Hewlett-Packard as proprietary
software in the 1980s, it was released as open source in 2005 and
development has been sponsored by Google since
2006. [Wikipedia]
Text detection is the process of finding where text appears in an image. In this post, we will detect text and draw a bounding box around each place it is found. Before we start coding, let's install Tesseract on our system.
Step 1: Install Tesseract 4. How you do this depends on which version of Ubuntu you have. On Ubuntu 18.04 it is very easy; just use this command:
sudo apt install tesseract-ocr
But if you have an Ubuntu version lower than 18.04, use the commands below:
sudo add-apt-repository ppa:alex-p/tesseract-ocr
sudo apt-get update
sudo apt install tesseract-ocr
Once the installation is done, you can check the Tesseract version with:
tesseract -v
This is what my terminal showed when I ran ‘tesseract -v’.
Step 2: Once Tesseract is installed, we need Pillow for image handling and pytesseract, which provides the Python binding so that we can use Tesseract in our Python code. We will also install imutils. Use the commands below:
pip install pillow
pip install pytesseract
pip install imutils
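Before moving on, it can help to confirm that everything is visible from Python. The sketch below is my own addition and only uses the standard library. The helper name `check_environment` is hypothetical, and it assumes OpenCV is available as the `cv2` module (for example through `pip install opencv-python`), since the later steps read the image with OpenCV.

```python
import importlib.util
import shutil


def check_environment(modules=("PIL", "pytesseract", "imutils", "cv2")):
    """Report which Python modules and which Tesseract binary can be found."""
    # find_spec returns None when a top-level module is not installed
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # pytesseract shells out to the `tesseract` executable, so it must be on PATH
    status["tesseract binary"] = shutil.which("tesseract") is not None
    return status


for name, found in check_environment().items():
    print(f"{name:18s} {'found' if found else 'MISSING'}")
```

Anything reported as MISSING needs to be installed before the next steps will work.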
Great! Now we can
start coding in Python for text detection.
Step 3: Read the test
image. By default, OpenCV reads an image in BGR format, but Tesseract
needs RGB format, so convert the image to RGB.
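Here is a minimal sketch of this step, assuming OpenCV is installed as `cv2`. The helper name `load_rgb` is hypothetical. It returns both versions of the image, on the assumption that the BGR copy is handy later for drawing and display with OpenCV, while the RGB copy goes to Tesseract.

```python
import os


def load_rgb(image_path):
    """Read an image with OpenCV and return (bgr, rgb) versions of it."""
    if not os.path.isfile(image_path):
        # cv2.imread returns None for a missing file instead of raising
        raise FileNotFoundError(f"No such image: {image_path}")
    import cv2  # imported here so the path check works even without OpenCV

    bgr = cv2.imread(image_path)  # OpenCV loads channels in BGR order
    if bgr is None:
        raise ValueError(f"OpenCV could not decode: {image_path}")
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)  # reorder channels for Tesseract
    return bgr, rgb
```

It would be called as `bgr, rgb = load_rgb("test.jpg")`, where `test.jpg` stands in for your own test image.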
Step 4: Now we detect
the text with pytesseract’s ‘image_to_data’ function. We then
need to post-process its output to draw the bounding boxes.
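The call itself is short. The sketch below wraps it in a hypothetical helper named `run_ocr`; `image_to_data` and `Output.DICT` come from pytesseract, and asking for a dict gives back parallel lists under keys such as `left`, `top`, `width`, `height`, `conf` and `text`, one entry per detected element.

```python
def run_ocr(rgb_image):
    """Return Tesseract's detection results as a dict of parallel lists."""
    # Imported inside the function so a missing package only fails at call time
    import pytesseract
    from pytesseract import Output

    # Output.DICT asks for a Python dict instead of the default TSV string
    return pytesseract.image_to_data(rgb_image, output_type=Output.DICT)
```

The function expects the RGB image from the previous step, either as a NumPy array or as a Pillow image.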
Step 5: We walk through the detected text and get the bounding box coordinates, the text itself, and the confidence with which it was detected. This part is called text localization.
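A minimal sketch of that walk, assuming the results are in the dict form that `image_to_data` returns with `Output.DICT`. The helper name `iter_words` is hypothetical, and the `sample` dict holds made-up values that only illustrate the shape of the data. The confidence is passed through `float()` because, depending on the pytesseract version, it can arrive as a string or as a number.

```python
def iter_words(results):
    """Yield (x, y, w, h, text, conf) for every entry in an image_to_data dict."""
    for i in range(len(results["text"])):
        x = results["left"][i]    # left edge of the bounding box
        y = results["top"][i]     # top edge of the bounding box
        w = results["width"][i]
        h = results["height"][i]
        text = results["text"][i]
        conf = float(results["conf"][i])  # may arrive as str, int or float
        yield x, y, w, h, text, conf


# Made-up sample in the same layout as image_to_data output (illustration only)
sample = {
    "left": [0, 36, 120],
    "top": [0, 92, 92],
    "width": [640, 70, 95],
    "height": [480, 28, 28],
    "conf": ["-1", "96", "41"],
    "text": ["", "Hello", "wor1d"],
}

for x, y, w, h, text, conf in iter_words(sample):
    print(f"box=({x}, {y}, {w}, {h}) conf={conf} text={text!r}")
```

Entries with a confidence of -1 are structural elements such as blocks or lines rather than recognized words, which is one reason a confidence filter is useful.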
Step 6: We can apply a
confidence threshold to filter out the weak detections.
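As a rough sketch, the filter can be a simple comparison on the `conf` list. The helper name `filter_words` is hypothetical, and the default threshold of 50 is only an assumed starting value that you should tune for your own images. The sketch also skips entries whose text is empty or whitespace.

```python
def filter_words(results, min_conf=50):
    """Return indices of entries whose confidence exceeds min_conf."""
    kept = []
    for i, text in enumerate(results["text"]):
        conf = float(results["conf"][i])  # may arrive as str, int or float
        # Drop weak detections and entries that contain no visible text
        if conf > min_conf and text.strip():
            kept.append(i)
    return kept
```

The returned indices can be used to look up `left`, `top`, `width`, `height` and `text` for the detections that survive the filter.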
Step 8: Show the image. Congratulations! You have detected text in an image.
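For drawing and display, here is a minimal sketch using OpenCV. It assumes the detections you want to keep have been collected into a list of `(x, y, w, h, text)` tuples, which is a hypothetical layout, and that `image` is the BGR image read by OpenCV. The helper names `draw_boxes` and `show`, the colors and the font size are all my own choices.

```python
def draw_boxes(image, boxes):
    """Draw a rectangle and a label on image for each (x, y, w, h, text) box."""
    import cv2  # imported here so a missing OpenCV only fails at call time

    drawn = 0
    for x, y, w, h, text in boxes:
        # Green rectangle around the detected text
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Label just above the box; (0, 0, 255) is red in BGR order
        cv2.putText(image, text, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
        drawn += 1
    return drawn


def show(image, title="Text detection"):
    """Open a window with the annotated image and wait for a key press."""
    import cv2

    cv2.imshow(title, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```

Note that `cv2.imshow` needs a desktop session; on a headless machine, saving the result with `cv2.imwrite` is an alternative.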
If you want to
download the whole code with the test image, you can download it from
here. Do let me know your feedback and comments below. Stay connected
for more blog posts; until then, stay home and stay safe.