You get an exception because the subprocess cannot find the binaries (the tesseract executable).
Installation is a three-step process:
1. Download and install the libraries/binaries at the system level:
For other operating systems, see the help here. On macOS, you can install it directly using brew.
Install Google Tesseract OCR (additional information on how to install the engine on Linux, Mac OSX, and Windows). You must be able to invoke the tesseract command as tesseract. If this is not the case, for example because tesseract is not in your PATH, you will have to change "tesseract_cmd" at the top of tesseract.py. Under Debian / Ubuntu, you can use the tesseract-ocr package. On Mac OS, install the tesseract Homebrew package.
For Windows:
The installer for the old version 3.02 is available for Windows from our download page. It includes the English training data. If you want to use another language, download the relevant training data, unzip it using 7-zip, and copy the .traineddata file into the 'tessdata' directory, probably C:\Program Files\Tesseract-OCR\tessdata.
To access Tesseract-OCR from anywhere, you may need to add the directory containing the Tesseract-OCR binaries to the PATH environment variable, probably C:\Program Files\Tesseract-OCR.
You can download the .exe from here.
2. Install the Python package
pip install pytesseract
3. Finally, you need to have the tesseract binary in PATH.
Or you can set its location at runtime:
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '<path-to-tesseract-bin>'
For Windows:
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
The above line only takes effect for the running script. For a permanent solution, add the directory containing tesseract.exe to PATH - for example, PATH=%PATH%;C:\Program Files (x86)\Tesseract-OCR.
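A small helper can combine both approaches: look on PATH first, then fall back to known install locations. This is only a sketch; `find_tesseract` and `DEFAULT_CANDIDATES` are hypothetical names introduced here, and the fallback paths are the ones mentioned in this answer, so adjust them to your installation.

```python
import os
import shutil

# Assumed fallback locations (taken from the paths mentioned in this answer);
# change them to wherever the installer actually put Tesseract.
DEFAULT_CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(command="tesseract", candidates=DEFAULT_CANDIDATES):
    """Return a path to the tesseract executable, or None if none is found."""
    # shutil.which searches the directories listed in the PATH variable.
    on_path = shutil.which(command)
    if on_path:
        return on_path
    # Otherwise try each explicit location in order.
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

If the function returns a path, it can be assigned to pytesseract.pytesseract.tesseract_cmd as shown above. If it returns None, Tesseract is most likely not installed or is installed somewhere else.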
Also, verify that the Windows environment variable TESSDATA_PREFIX is set to the directory containing the tessdata directory. For instance:
TESSDATA_PREFIX=C:\Program Files (x86)\Tesseract-OCR
i.e. tessdata is located at: C:\Program Files (x86)\Tesseract-OCR\tessdata
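To check this layout from Python, something like the following may help. It is a sketch that assumes the layout described above (TESSDATA_PREFIX is the parent of tessdata); some Tesseract versions may interpret the variable differently. `tessdata_file` and `has_language` are hypothetical helper names.

```python
import os


def tessdata_file(lang="eng", prefix=None):
    """Return the expected path of <lang>.traineddata under TESSDATA_PREFIX."""
    if prefix is None:
        # Fall back to the environment variable; empty string if it is unset.
        prefix = os.environ.get("TESSDATA_PREFIX", "")
    # Assumption: the prefix is the directory that contains 'tessdata'.
    return os.path.join(prefix, "tessdata", lang + ".traineddata")


def has_language(lang="eng", prefix=None):
    """Return True if the training data file for `lang` exists."""
    return os.path.isfile(tessdata_file(lang, prefix))
```

For example, has_language("eng") should be True on a working installation. If it is False, either the variable points to the wrong directory or the training data file is missing.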
Your example:
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
print(pytesseract.image_to_string(Image.open(r'D:\new_folder\img.png')))
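Before running the example, you can confirm that the executable actually starts by calling it yourself. The sketch below uses only the standard library; `tesseract_version` is a hypothetical helper, and it relies on the `--version` flag of the tesseract command line.

```python
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line printed by `<cmd> --version`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [cmd, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some versions print the version to stderr
            universal_newlines=True,
        )
    except OSError:
        # Raised when the executable is missing or cannot be executed,
        # which is the same situation that makes pytesseract fail.
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""
```

Pass the same value you assign to tesseract_cmd, e.g. tesseract_version('C:/Program Files (x86)/Tesseract-OCR/tesseract'). A result of None means the path is wrong or the binary is not installed.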