I converted a PDF file to .tiff, which is pretty simple.
convert -depth 4 -density 300 -background white +matte eng.arial.pdf eng.arial.tiff
Then I ran tesseract on the .tiff file to generate the box file -
tesseract eng.arial.tiff eng.arial batch.nochop makebox
Then I ran tesseract on the .tiff file again in training mode -
tesseract eng.arial.tiff eng.arial.box nobatch box.train.stderr
Then I tried to extract the character set used -
unicharset_extractor *.box
But I get this error -
unicharset_extractor:./.libs/lt-unicharset_extractor.c:233: FATAL: couldn't find unicharset_extractor.
This also happens with mftraining and combine_tessdata.
UPDATE
I ran unicharset_extractor on a single .box file and it still does not work.

This happens not only with this command, but also with mftraining, cntraining and combine_tessdata.
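The `./.libs/lt-` prefix in the error suggests that a libtool wrapper script from a build tree is being run, rather than an installed binary. This is an assumption based on libtool's naming convention and has not been confirmed for this setup. A minimal sketch for narrowing it down is to check where each training tool resolves on PATH. The function name `check_training_tools` is hypothetical and not part of Tesseract; the tool names are the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): report where each
# named tool resolves on PATH, or that it cannot be found.
check_training_tools() {
    missing=0
    for tool in "$@"; do
        if resolved=$(command -v "$tool" 2>/dev/null); then
            printf '%s: %s\n' "$tool" "$resolved"
        else
            printf '%s: not found on PATH\n' "$tool"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every tool was found.
    [ "$missing" -eq 0 ]
}

# The tools named in this question; "|| true" keeps the script going
# even if some of them are missing.
check_training_tools unicharset_extractor mftraining cntraining combine_tessdata || true
```

If a tool resolves to a path inside the source tree, or is not found at all, that would be consistent with the tools having been built but not installed.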
mjosh