Grab image and convert to text with pytesseract

What is this code for?

It’s incredible how much you can do in Python with a couple of lines of code, thanks to the awesome community around this language. So, let’s dive into a few lines that grab a part of the screen, recognize the text inside the picture and print it. In a future post we will also add other features, like saving the text to a file, converting it to an mp3 and, why not, translating it automatically. Is it too much? We’ll see.

I combined two things from previous posts:

grab an image from the computer screen
convert the text in the image into editable text

Where is the code?

The code:

# grabscreen.py

import pyscreenshot as ImageGrab
import os
from pynput.mouse import Listener
import sys
import pytesseract


def grab(x1, y1, x2, y2):
    # bbox is (left, top, right, bottom) in screen pixels, not (x, y, width, height)
    im = ImageGrab.grab(bbox=(x1, y1, x2, y2))
    save(im)
    ocr()


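def save(im):
    im.save('im.png')
    # Open the screenshot in the default viewer (os.startfile exists only on Windows)
    os.startfile('im.png')


def ocr():
    # Path to the Tesseract executable; adjust it to your own installation
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
    print(pytesseract.image_to_string('im.png'))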

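# State shared between clicks: whether the first corner was clicked, and where
click1 = 0
x1 = 0
y1 = 0


def on_click(x, y, button, pressed):
    global click1, x1, y1

    if pressed:
        if click1 == 0:
            # First click: remember the top left corner
            x1 = x
            y1 = y
            click1 = 1
        else:
            # Second click: grab the region, then stop the listener
            grab(x1, y1, x, y)
            return False  # returning False from a pynput callback stops the listener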

print("Click once on top left and once on bottom right")
with Listener(on_click=on_click) as listener:
    listener.join()
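
The script above expects the first click on the top left and the second on the bottom right, and it keeps the click state in global variables. Here is a minimal sketch of one way to relax both points. The names normalize_bbox and TwoClickSelector are hypothetical helpers made up for this example, not part of pynput or pyscreenshot; the only things assumed from pynput are the on_click callback signature and the fact that returning False from a callback stops the listener.

```python
from typing import Optional, Tuple

BBox = Tuple[int, int, int, int]


def normalize_bbox(x1, y1, x2, y2) -> BBox:
    """Return (left, top, right, bottom) whatever order the corners were clicked in."""
    left, right = sorted((int(x1), int(x2)))
    top, bottom = sorted((int(y1), int(y2)))
    return left, top, right, bottom


class TwoClickSelector:
    """Hypothetical helper: collects two mouse presses and stores the resulting box."""

    def __init__(self) -> None:
        self.first: Optional[Tuple[int, int]] = None
        self.bbox: Optional[BBox] = None

    def on_click(self, x, y, button, pressed):
        # Same signature as a pynput mouse on_click callback
        if not pressed:
            return None  # ignore button releases
        if self.first is None:
            self.first = (x, y)
            return None  # keep listening for the second click
        self.bbox = normalize_bbox(self.first[0], self.first[1], x, y)
        return False  # pynput stops the listener when a callback returns False


# Simulated clicks (no real mouse needed): bottom right first, then top left
selector = TwoClickSelector()
selector.on_click(400, 300, None, True)
selector.on_click(400, 300, None, False)
selector.on_click(100, 50, None, True)
print(selector.bbox)  # (100, 50, 400, 300)
```

In the real script you would pass selector.on_click to Listener(on_click=...) and, once listener.join() returns, call the grab function with the four values stored in selector.bbox.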