Creating a new eng.tessdata file for a custom font in Tesseract gives an error

I converted a PDF file to .tiff, which is pretty simple:

 convert -depth 4 -density 300 -background white +matte eng.arial.pdf eng.arial.tiff 

Then I ran tesseract on the .tiff file to generate a box file:

 tesseract eng.arial.tiff eng.arial batch.nochop makebox 

Then I passed the .tiff file to tesseract again in training mode:

 tesseract eng.arial.tiff eng.arial.box nobatch box.train.stderr

Then I tried to extract the character set used:

 unicharset_extractor *.box 

But I get this error -

 unicharset_extractor:./.libs/lt-unicharset_extractor.c:233: FATAL: couldn't find unicharset_extractor. 

And this also happens for mftraining and combine_tessdata.
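
For reference, the steps above can be collected into one helper that only prints the command sequence for a given base name. This is a rough sketch, not a tested training script: the function name `training_commands` and its parameter are hypothetical, and `box.train.stderr` is written as a single config name on the assumption that this is the intended form.

```shell
#!/bin/sh
# Print the command sequence from this question for one "lang.font" base name.
# Nothing is executed here; the commands are only written to stdout.
training_commands() {
    base=$1   # hypothetical parameter, e.g. eng.arial
    printf '%s\n' \
        "convert -depth 4 -density 300 -background white +matte ${base}.pdf ${base}.tiff" \
        "tesseract ${base}.tiff ${base} batch.nochop makebox" \
        "tesseract ${base}.tiff ${base}.box nobatch box.train.stderr" \
        "unicharset_extractor ${base}.box"
}

# Show the four steps for the font used in this question.
training_commands eng.arial
```

Printing the commands instead of running them makes it easy to see exactly which step is the first one to fail.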

UPDATE

I ran unicharset_extractor on a single box file and it is still not working.

And this happens not only with this command, but also with mftraining, cntraining and combine_tessdata.
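
Since the message says a tool could not be found, one quick check is whether each training tool resolves on PATH at all. The following is a minimal sketch assuming a POSIX shell; `missing_tools` is a hypothetical helper name, and the check only narrows the problem down, it does not by itself explain the failure.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the tools named in this question; no output means all were found.
missing_tools tesseract unicharset_extractor mftraining cntraining combine_tessdata
```

If a tool is listed as missing, the failure is likely an installation or PATH issue rather than a problem with the box files.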


Source: https://habr.com/ru/post/1243754/

