I could not find detailed documentation, and I would rather not dig through the source code. I don't want to redo Canny edge detection if the Tesseract engine already does it.
This document provides an overview of the engine: https://github.com/tesseract-ocr/docs/blob/master/tesseracticdar2007.pdf
So it looks like you don't need to implement Canny edge detection.
Tesseract binarizes the image with Otsu's threshold before processing it: https://github.com/tesseract-ocr/tesseract/blob/master/ccstruct/otsuthr.h
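To make the binarization step concrete, here is a minimal pure-Python sketch of Otsu's method on a flat list of 8-bit gray values. It is only an illustration of the general technique, not the code from otsuthr.h; the function names otsu_threshold and binarize are made up for this example, and Tesseract's own implementation may differ in details such as how it handles regions or channels.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Class 0 holds pixels with value <= t, class 1 holds the rest.
    `pixels` is a flat iterable of integers in the range [0, levels).
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of gray values in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 is still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 is empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 if it is above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]
```

For an image with two clearly separated gray levels, the returned threshold falls between them, so binarize splits the pixels into the two groups.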
Edit: if you want the binarized image, create a new configuration file in "tessdata\configs", add the line tessedit_write_images True to it, and process the image with tesseract your_image out your_config_file. Tesseract saves the binarized image as tessinput.tif.
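The steps from the edit can also be scripted. The following is a rough sketch in Python, assuming the tesseract executable is on your PATH and that you know where your tessdata directory is. The helper names (write_config, build_command, dump_binarized) and the config file name write_images are hypothetical, chosen only for this example; the variable tessedit_write_images and the command layout come from the answer above.

```python
import subprocess
from pathlib import Path


def write_config(configs_dir, name="write_images"):
    """Create a config file containing the single line from the answer.

    `configs_dir` is assumed to be the "configs" folder inside your
    tessdata directory; its location depends on the installation.
    """
    configs_dir = Path(configs_dir)
    configs_dir.mkdir(parents=True, exist_ok=True)
    config_path = configs_dir / name
    config_path.write_text("tessedit_write_images True\n")
    return config_path


def build_command(image, output_base, config_name="write_images"):
    """Build the command: tesseract your_image out your_config_file."""
    return ["tesseract", str(image), str(output_base), config_name]


def dump_binarized(image, output_base, configs_dir, config_name="write_images"):
    """Write the config, run Tesseract, and return the process result.

    According to the answer, the binarized image is saved as tessinput.tif.
    """
    write_config(configs_dir, config_name)
    command = build_command(image, output_base, config_name)
    return subprocess.run(command, check=True)
```

A call such as dump_binarized("your_image.png", "out", "path/to/tessdata/configs") would then run the same command as above, with the paths replaced by your own.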
Source: https://habr.com/ru/post/1207215/