Wednesday, November 4, 2020

Socket programming with Python (sending text messages and image files)

Recently I was working with socket programming, and I was amazed to learn how two machines placed far apart can communicate. With sockets we can not only exchange messages but also send and receive any kind of data, including images. So I thought of sharing this knowledge with you all.


We will go directly to the implementation, as there are many descriptions of socket programming online. Today I will share two implementations: the first sends text messages between sockets, and the second sends an image from the server to the client machine via sockets. So let's get started:

1. Sending text messages between machines via sockets:
Server side: The server has a bind() method, which binds it to a specific IP address (or a local address if the machines are connected via LAN) and port so that it can listen for incoming requests on that IP and port. The server also has a listen() method, which puts it into listening mode so that it can receive incoming connections. Finally, the server has accept(), which waits for a client and returns a new socket for that connection, and close(), which ends it. Please see the code below for an example. You can download the code from GitHub.
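To make the server side concrete, here is a minimal sketch of the bind(), listen(), accept(), and close() sequence described above. It is not the code from the GitHub repository: the function name serve_once, the loopback address 127.0.0.1, and port 5000 are assumptions chosen for illustration.

```python
import socket

HOST = "127.0.0.1"  # assumption: loopback; use the server's LAN IP for two machines
PORT = 5000         # assumption: any free port works


def serve_once(host=HOST, port=PORT, message="Hey! Welcome"):
    """Accept a single client, send it a greeting, then close everything."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Allow quick restarts of the script on the same port.
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))   # attach the socket to the IP and port
        server.listen(1)            # switch to listening mode
        print("Server started listening")
        conn, addr = server.accept()  # blocks until a client connects
        with conn:                    # the with block calls conn.close() for us
            conn.sendall(message.encode("utf-8"))
```

Calling serve_once() at the bottom of the script starts the server; it blocks in accept() until a client connects.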
Client side: The client code connects to the server machine at the specified IP address (or local address) and prints the received message.
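A matching client can be sketched in a few lines. Again, this is only an illustration under assumptions: the function name fetch_greeting and the default address 127.0.0.1:5000 are made up here and must match whatever the server was bound to.

```python
import socket


def fetch_greeting(host="127.0.0.1", port=5000, bufsize=1024):
    """Connect to the server, read one short message, and return it as text."""
    # create_connection() creates the socket and connects it in one step.
    with socket.create_connection((host, port)) as client:
        data = client.recv(bufsize)  # a short greeting fits in one recv() call
    return data.decode("utf-8")
```

In the client script you would then write print(fetch_greeting()) to show the message on the terminal.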
You need to run the server code first. Once it is running, it prints “Server started listening”. Then run the client code, which prints “Hey! Welcome”. Here is the output from the terminals.

 
2. Sending images from the server machine to the client via sockets:
We will use OpenCV to read the image data on the server and then pack the data with pickle to send it to the client machine. On the client machine we will unpack the data with pickle and then display it.

Server code:
The server code is similar to the code above, except for the part that reads the image with OpenCV and dumps it with pickle.
Then, with a while loop, we keep sending the data until all of it has gone out.
After all the data has been sent, we shut down and close the connection.
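The sending steps above can be sketched as follows. This is a hedged illustration, not the repository code: the helper name send_pickled, the 4096-byte chunk size, and the choice of a '=L' length header are assumptions. In the blog's setting, obj would be the array returned by cv2.imread(); the sketch itself only needs the standard library.

```python
import pickle
import socket
import struct

HEADER_FORMAT = "=L"  # assumption: 4-byte unsigned length prefix


def send_pickled(conn, obj, chunk_size=4096):
    """Pickle obj, prefix it with its size, send it all, then close conn."""
    payload = pickle.dumps(obj)
    message = struct.pack(HEADER_FORMAT, len(payload)) + payload

    # send() may accept only part of the data, so loop until everything is out.
    sent = 0
    while sent < len(message):
        sent += conn.send(message[sent:sent + chunk_size])

    conn.shutdown(socket.SHUT_WR)  # tell the client no more data is coming
    conn.close()
```

The size prefix is what lets the client know how many bytes to wait for before unpickling.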
Client code:
On the client side we first get the data size and then retrieve the data.
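A minimal sketch of that receive logic is shown here, assuming the server prefixes the pickled data with a 4-byte '=L' length header. The helper names recv_exact and recv_pickled are hypothetical and do not come from the original code.

```python
import pickle
import struct

HEADER_FORMAT = "=L"  # assumption: must match the format used by the server
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def recv_exact(conn, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(min(4096, n - len(buf)))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_pickled(conn):
    """Read the size header first, then the payload, and unpickle it."""
    (size,) = struct.unpack(HEADER_FORMAT, recv_exact(conn, HEADER_SIZE))
    # Only unpickle data from a server you trust: pickle can run arbitrary code.
    return pickle.loads(recv_exact(conn, size))
```

If the server sent an image array, the object returned by recv_pickled() is that same array on the client.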
   
Then we convert the data for visualization and show it with OpenCV.

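Note:
While packing and unpacking data for sockets, be careful about the format. For example, for Linux -> Linux we can specify 'L' in the ‘struct.pack’ function, whereas when communicating with a Raspberry Pi or Windows it should be '=L'. The reason is that plain 'L' uses the platform's native size for an unsigned long (8 bytes on 64-bit Linux, 4 bytes on Windows), while '=L' always uses the standard 4-byte size. Both ends must pack and unpack with the same format.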
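You can check the size difference on your own machine with struct.calcsize. The snippet below is a small illustration; the value 123456 is just an arbitrary example length.

```python
import struct

# Native 'L' follows the C compiler's unsigned long, so its size varies by platform.
native_size = struct.calcsize("L")
# '=L' uses the standard size, which is 4 bytes everywhere.
standard_size = struct.calcsize("=L")

print("native 'L' size:", native_size)
print("standard '=L' size:", standard_size)

# A header packed with '=L' has the same length on both ends of the connection.
header = struct.pack("=L", 123456)
(length,) = struct.unpack("=L", header)
```

If the two sizes printed differ on one of your machines, a header packed with 'L' there cannot be unpacked correctly by a machine where 'L' is 4 bytes.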

You can download the whole code from GitHub. Do share your comments and feedback below. Stay inside, stay safe, and keep learning. Cheers!

Thursday, June 11, 2020

Text detection and localization with Tesseract OCR

Hello all, hope you are doing well and keeping safe. I am writing a new blog post after a long break, still adjusting to the new lifestyle. Not sure when a vaccine will come for #covid19 and we will get back to our normal life again. Anyway, let's get started. Today we will learn how to detect and localize text in an image using Tesseract OCR.

Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005 and development has been sponsored by Google since 2006.[wikipedia]



Text detection is the process of detecting and localizing where text exists in an image. In this blog post, we will detect text and draw a bounding box wherever it is found. Before we actually start coding, we will learn how to install Tesseract on our system.
Step 1: How you install Tesseract 4 depends on which version of Ubuntu you have. If you have Ubuntu 18.04, it is super easy; just use this command:
              sudo apt install tesseract-ocr
But if you have an Ubuntu version lower than 18.04, use the commands below:
              sudo add-apt-repository ppa:alex-p/tesseract-ocr
              sudo apt-get update
              sudo apt install tesseract-ocr
Once the installation is done, you can check the Tesseract version with:
              tesseract -v
This is what my terminal showed when I ran ‘tesseract -v’.

Step 2: Once you have installed Tesseract, we need to install pytesseract, which provides the Python binding so that we can use Tesseract in our Python code, along with Pillow and imutils. Use the commands below:
            pip install pillow
            pip install pytesseract
            pip install imutils
Great!! Now we can start coding in Python for text detection.

Step 3: Open a new Python file, name it whatever you want, and import the necessary packages.

Step 4: Read the test image. By default OpenCV reads an image in BGR format, but Tesseract needs RGB, so convert the image to RGB.


Step 4 (continued): Now we detect the text with pytesseract's ‘image_to_data’ function. We then need to post-process its output to draw the bounding boxes.
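Reading the image, converting it to RGB, and calling image_to_data might look like the sketch below. The function name detect_text is an assumption made for this example, and the code needs OpenCV and pytesseract installed as described in the earlier steps.

```python
def detect_text(image_path):
    """Return the original BGR image and Tesseract's word-level results as a dict."""
    # Imported inside the function so this file can still be loaded
    # on a machine where OpenCV or pytesseract is not installed yet.
    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(image_path)  # OpenCV loads the image as BGR
    if image is None:
        raise FileNotFoundError(image_path)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Tesseract expects RGB

    # Output.DICT gives parallel lists under keys such as
    # "text", "conf", "left", "top", "width", and "height".
    results = pytesseract.image_to_data(rgb, output_type=Output.DICT)
    return image, results
```

The original BGR image is returned too, because that is the one we will draw on later.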




Step 5: We walk through the detected text and get the bounding box coordinates, the text, and the confidence with which it was detected. This part is called text localization.




Step 6: We can apply a threshold to filter out the weak detections.
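The walk through the results and the confidence filter can be sketched together in plain Python. The helper name extract_boxes and the default threshold of 60 are assumptions for illustration; the dictionary keys are the ones image_to_data returns when called with output_type=Output.DICT.

```python
def extract_boxes(results, min_conf=60.0):
    """Return (x, y, w, h, text, conf) for each confident, non-empty detection."""
    boxes = []
    for i, text in enumerate(results["text"]):
        # Depending on the pytesseract version, conf may be a string or a number.
        conf = float(results["conf"][i])
        text = text.strip()
        if conf < min_conf or not text:
            continue  # skip weak detections and empty entries
        boxes.append((
            results["left"][i],
            results["top"][i],
            results["width"][i],
            results["height"][i],
            text,
            conf,
        ))
    return boxes
```

Entries that are not words carry a confidence of -1, so they are dropped by the same check. Raising min_conf keeps fewer but more reliable words.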
Step 7: Draw the bounding boxes and write the corresponding text on the original image.
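Drawing could be done along these lines. The function name draw_boxes, the green color, and the (x, y, w, h, text, conf) tuple layout are assumptions for this sketch; cv2.rectangle and cv2.putText are the standard OpenCV drawing calls.

```python
def draw_boxes(image, boxes, color=(0, 255, 0)):
    """Draw a rectangle and a label on image for every (x, y, w, h, text, conf)."""
    # Imported here so the file loads even where OpenCV is not installed.
    import cv2

    for x, y, w, h, text, conf in boxes:
        # OpenCV's built-in Hershey fonts only cover ASCII, so drop other characters.
        label = "".join(ch for ch in text if ord(ch) < 128)
        cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
        # Place the label just above the box, but keep it inside the image.
        cv2.putText(image, label, (x, max(y - 10, 15)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
    return image
```

The returned image can then be displayed with cv2.imshow("Image", image) followed by cv2.waitKey(0).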


Step 8: Show the image. Congratulations!! You have detected text in an image.

If you want the whole code with the test image, you can download it from here. Do let me know your feedback and comments below. Stay connected for more blog posts; till then, stay home, stay safe.