How do I tell Tesseract which font to use during recognition (not during training)?

For the downloadable English trained data, I run

cat tessdata/eng.* | egrep -o ".*ttf" | sort -u

and get a list of all the fonts that were used to train the English data:

Andale_Mono.ttf
Arial_Black.ttf
Arial_Bold.ttf
Arial.ttf
buttf
Comic_Sans_MS_Bold.ttf
Comic_Sans_MS.ttf
Courier_New_Bold.ttf
Courier_New.ttf
Georgia_Bold.ttf
Georgia.ttf
Gottf
Impact.ttf
Times_New_Roman_Bold.ttf
Times_New_Roman.ttf
Trebuchet_MS_Bold.ttf
Trebuchet_MS.ttf
ttf
Verdana_Bold.ttf
Verdana.ttf
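
A note on the list above: the entries buttf, Gottf and ttf are most likely false matches, because the pattern ".*ttf" does not require a literal dot before "ttf" and the trained data files contain binary data. Below is a minimal sketch of a stricter filter using only the C++ standard library. The function name extract_font_names and the assumption that font names consist of letters, digits and underscores are mine, not something taken from Tesseract.

```cpp
#include <regex>
#include <set>
#include <string>

// Collect unique font file names of the form Name.ttf from raw data.
// The pattern requires at least one name character followed by a literal
// ".ttf", so fragments such as "buttf" or a bare "ttf" are not matched.
// Assumption: font names use only letters, digits and underscores.
std::set<std::string> extract_font_names(const std::string& data) {
    static const std::regex font_re(R"([A-Za-z0-9_]+\.ttf)");
    std::set<std::string> fonts;  // std::set keeps entries sorted and unique
    for (std::sregex_iterator it(data.begin(), data.end(), font_re), end;
         it != end; ++it) {
        fonts.insert(it->str());
    }
    return fonts;
}
```

Reading the contents of the tessdata/eng.* files into a std::string and passing it to this function should give roughly the same result as the pipeline above, minus the entries without a dot.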

Now I want to recognize text whose font I already know, so I want to restrict recognition to that font. I tried:

api.SetVariable("classify_font_name", "Arial_Bold.ttf");

but the results did not improve. Can someone tell me how to do this, or whether it is possible at all?

1 answer

See LTRResultIterator::WordFontAttributes in the Tesseract API.


Source: https://habr.com/ru/post/1538951/

