You can convert images to PDF using iText. The hard part here is the OCR, not creating the PDF.
I will warn you: any OCR engine worth using will cost you a significant amount of money. Free and/or open source OCR software is usually a pet project or a proof of concept for some algorithm, and is not suitable for real-world OCR applications. Tesseract is probably the best of the bunch, but even its accuracy is much worse than that of the commercial engines.
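If you want to check the accuracy gap on your own documents rather than take my word for it, one common approach is to compare each engine's output against hand-typed ground truth using edit distance. The following is only a minimal sketch of that idea: the class and method names are made up for illustration, and it assumes you already have the ground-truth text and the OCR output as plain strings.

```java
// Hypothetical helper for comparing OCR output with ground-truth text.
// Not part of any OCR engine's API; names are illustrative only.
final class OcrAccuracy {

    // Classic Levenshtein edit distance (insert, delete, substitute = cost 1),
    // computed with two rolling rows to keep memory proportional to b.length().
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Character accuracy = 1 - (edit distance / ground-truth length),
    // clamped at 0 so very bad output does not go negative.
    static double characterAccuracy(String groundTruth, String ocrOutput) {
        if (groundTruth.isEmpty()) {
            return ocrOutput.isEmpty() ? 1.0 : 0.0;
        }
        double accuracy =
            1.0 - (double) editDistance(groundTruth, ocrOutput) / groundTruth.length();
        return Math.max(0.0, accuracy);
    }
}
```

Run the same set of sample pages through each engine you are considering and compare the resulting numbers. Use pages that are representative of what you will actually process, since results vary a lot with scan quality and fonts.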
We have a commercial OCR application, and along the way I went through an evaluation of the engines. I would advise you to bite the bullet, contact the engine vendors and get quotes: Abbyy (best accuracy, most expensive, slower), Expervision (fast, not as accurate, middle-of-the-road price), Nuance (average speed, accuracy and price). None of them are written in Java, so you should plan some time for developing JNI code around their APIs.
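To give an idea of what the Java side of that JNI work looks like, here is a rough sketch. Everything specific in it is an assumption for illustration: the library name `ocrbridge_hypothetical`, the class name, and the `recognize0` signature are invented, and none of the vendors' real APIs look like this. You would write the matching native (C/C++) glue yourself against whatever SDK you license.

```java
// Sketch of the Java half of a JNI bridge to a native OCR SDK.
// The library name and method signature below are hypothetical.
final class NativeOcrBridge {

    // Hypothetical name of your own JNI glue library (not a vendor library).
    private static final String LIBRARY_NAME = "ocrbridge_hypothetical";

    private static final boolean AVAILABLE = tryLoad();

    private static boolean tryLoad() {
        try {
            // Looks for libocrbridge_hypothetical.so / ocrbridge_hypothetical.dll
            // on java.library.path.
            System.loadLibrary(LIBRARY_NAME);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Library missing or built for the wrong architecture.
            return false;
        }
    }

    static boolean isAvailable() {
        return AVAILABLE;
    }

    // Implemented in native code; the C side would call into the vendor SDK.
    private static native String recognize0(byte[] imageBytes, String language);

    // Public entry point: fails with a clear message instead of an
    // UnsatisfiedLinkError deep inside application code.
    static String recognize(byte[] imageBytes, String language) {
        if (!AVAILABLE) {
            throw new IllegalStateException(
                "Native OCR library '" + LIBRARY_NAME + "' could not be loaded");
        }
        return recognize0(imageBytes, language);
    }
}
```

Keeping the native method private behind a small Java wrapper like this means the rest of your application does not need to know which engine you ended up with, which helps if you switch vendors after getting quotes.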
Good luck, it's a big project!