The task is to convert a .doc file to PDF while preserving all formatting, such as tables, images, and alignment.
Creating your own converter class
Apache POI already provides the WordToXxxConverter classes, namely WordToFoConverter, WordToHtmlConverter, and WordToTextConverter. The last one is most likely too simplistic to serve as a model for your requirements, but the first two are suitable.
All these converter classes derive from the common AbstractWordConverter base class, which provides the basic framework for Word conversion classes. In addition, each of them uses a corresponding *DocumentFacade class, which wraps the creation of the specific target (or some intermediate) format: FoDocumentFacade, HtmlDocumentFacade, or TextDocumentFacade.
To convert a document to PDF with all formatting, such as tables, images, and alignment, you should therefore also derive your converter class from AbstractWordConverter and, for the implementation of its abstract methods, take inspiration from the three existing implementation classes. Just as in the other converter classes, concentrating most of the PDF-library-specific code in a PdfDocumentFacade class seems like a good idea.
If you want to start simple and add more complex details later, you can start with code based mostly on the WordToTextConverter implementation and, as soon as that works at a proof-of-concept level, extend the functionality to cover more and more formatting information.
Unfortunately, this converter infrastructure is tied to the DOM: the AbstractWordConverter callbacks expect and forward DOM elements as indicators of the current context in the target document. At first glance, nothing seems to require this context to be a DOM element, so you could work around this by copying the base class and replacing the DOM element parameters with a more appropriate type, or better still with a generic type parameter.
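To make the idea of a facade plus a generic context parameter more concrete, here is a minimal sketch. All names in it (PdfDocumentFacade, ContextWordConverter, WordToPdfConverter, OutlineFacade, OutlineNode) are hypothetical and not part of POI, and the callbacks are heavily simplified compared to the real AbstractWordConverter. The stand-in facade only records the structure it is given; a real one would call your PDF library instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical facade: hides the target format behind a small interface,
// playing the role that FoDocumentFacade and HtmlDocumentFacade play in POI.
// C is the "current context" type that POI passes around as a DOM Element.
interface PdfDocumentFacade<C> {
    C root();
    C beginBlock(C parent, String kind);
    void addText(C context, String text);
}

// Hypothetical, heavily simplified copy of the base class idea:
// the callbacks take a generic context C instead of org.w3c.dom.Element.
abstract class ContextWordConverter<C> {
    public abstract void processParagraph(C parent, String text);
    public abstract void processTable(C parent, String[][] cells);
}

// Hypothetical converter: all target-specific work is delegated to the facade.
final class WordToPdfConverter<C> extends ContextWordConverter<C> {
    private final PdfDocumentFacade<C> facade;

    WordToPdfConverter(PdfDocumentFacade<C> facade) {
        this.facade = facade;
    }

    @Override
    public void processParagraph(C parent, String text) {
        C paragraph = facade.beginBlock(parent, "paragraph");
        facade.addText(paragraph, text);
    }

    @Override
    public void processTable(C parent, String[][] cells) {
        C table = facade.beginBlock(parent, "table");
        for (String[] row : cells) {
            C rowContext = facade.beginBlock(table, "row");
            for (String cell : row) {
                facade.addText(facade.beginBlock(rowContext, "cell"), cell);
            }
        }
    }
}

// Stand-in context type for the sketch: a plain tree node, no PDF library involved.
final class OutlineNode {
    final String kind;
    final StringBuilder text = new StringBuilder();
    final List<OutlineNode> children = new ArrayList<>();

    OutlineNode(String kind) {
        this.kind = kind;
    }

    // Renders the subtree as kind(text)[child,child,...].
    String dump() {
        StringBuilder sb = new StringBuilder(kind);
        if (text.length() > 0) sb.append('(').append(text).append(')');
        if (!children.isEmpty()) {
            sb.append('[');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(children.get(i).dump());
            }
            sb.append(']');
        }
        return sb.toString();
    }
}

// Stand-in facade that only records structure; a real one would emit PDF content.
final class OutlineFacade implements PdfDocumentFacade<OutlineNode> {
    private final OutlineNode root = new OutlineNode("document");

    @Override public OutlineNode root() { return root; }

    @Override public OutlineNode beginBlock(OutlineNode parent, String kind) {
        OutlineNode node = new OutlineNode(kind);
        parent.children.add(node);
        return node;
    }

    @Override public void addText(OutlineNode context, String text) {
        context.text.append(text);
    }
}

final class PdfSketchDemo {
    public static void main(String[] args) {
        OutlineFacade facade = new OutlineFacade();
        WordToPdfConverter<OutlineNode> converter = new WordToPdfConverter<>(facade);
        converter.processParagraph(facade.root(), "Hello");
        converter.processTable(facade.root(), new String[][] {{"a", "b"}});
        System.out.println(facade.root().dump());
    }
}
```

Because the converter only ever talks to the facade, swapping OutlineFacade for one backed by a PDF library would not require touching the traversal code, which is the main reason to keep the library-specific code in one place.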
Using existing Word-to-XXX converters in combination with existing XXX-to-PDF converters
If this seems too complicated or time-consuming given your resources, you can try a different approach: use the output of one of the existing converters mentioned above as input for a second conversion to PDF.
Using existing converter classes will give you results sooner, but multi-stage conversions usually lose more information than single-step ones. The decision is up to you.
In the code you posted in your question, you used iText classes. iText supports conversion from HTML to PDF, with certain restrictions, using the XMLWorker provided in the iText XML Worker subproject. Older versions of iText used the HTMLWorker class for this, which is now deprecated. Thus, using WordToHtmlConverter in combination with iText XMLWorker may be an option for you.
Alternatively, Apache also provides XSL-FO processing to PDF (Apache FOP). Applying it to the output of WordToFoConverter can also be an option.
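Whichever two-stage route you pick, there is a small glue step in between. As far as I can tell, both WordToHtmlConverter and WordToFoConverter build an org.w3c.dom.Document (available via getDocument() after processDocument(...)), while the second-stage converter typically wants a stream of well-formed XML. The sketch below shows only that serialization step, using nothing but JDK classes; the class name DomToXml is made up, and the hand-built document merely stands in for real converter output.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: serializes a DOM document as well-formed XML.
final class DomToXml {

    static String serialize(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Force the "xml" method: with an <html> root the default may be the
        // "html" method, which can emit unclosed tags such as <br>.
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    // Builds a tiny stand-in for converter output; in real code the document
    // would come from the POI converter instead.
    static Document sampleDocument() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = document.createElement("html");
        Element body = document.createElement("body");
        Element paragraph = document.createElement("p");
        paragraph.setTextContent("Hello");
        body.appendChild(paragraph);
        body.appendChild(document.createElement("br"));
        html.appendChild(body);
        document.appendChild(html);
        return document;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

For the HTML route, the resulting XHTML could then be passed as an InputStream to XMLWorkerHelper.getInstance().parseXHtml(writer, document, inputStream) from iText XML Worker. For the FO route you can probably skip the string entirely and transform the DOMSource straight into a SAXResult wrapping the FOP handler. Check both against the library versions you actually use.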