There are three types of forms in PDF:
- Forms using AcroForm technology. In this case, each field corresponds to one or more widgets with fixed positions on certain pages. The form is described using only the PDF syntax.
- Dynamic forms using the XML Form Architecture (XFA). In this case, the PDF file is nothing more than a container for the XML file that describes the entire form. We call this dynamic XFA because the form can expand or contract based on the data added: a 1-page form can turn into a 100-page form by adding more data.
- Hybrid forms combining AcroForm technology and XFA. In this case, the form is described twice: once using PDF objects and once using XML. Obviously, such a form is not dynamic: the AcroForm part still defines widget annotations at absolute positions on specific pages, so the form cannot adapt to its data.
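To make the distinction concrete, here is a minimal sketch of how the three types could be told apart. It assumes the /AcroForm dictionary has already been read into a plain Python dict by some PDF library; the function name `classify_form` is hypothetical, and treating "XFA present, no fields" as dynamic XFA is a heuristic rather than an official rule.

```python
def classify_form(acroform):
    """Classify a form from a dict-like view of its /AcroForm dictionary.

    `acroform` is None when the document catalog has no /AcroForm entry.
    Heuristic (an assumption, not a rule from the PDF specification):
    an /XFA entry without any AcroForm fields is treated as dynamic XFA.
    """
    if acroform is None:
        return "no form"
    has_xfa = "/XFA" in acroform
    has_fields = len(acroform.get("/Fields", [])) > 0
    if has_xfa and has_fields:
        return "hybrid"        # described twice: PDF objects and XML
    if has_xfa:
        return "dynamic XFA"   # the PDF is only a container for the XML
    if has_fields:
        return "AcroForm"      # described using PDF syntax only
    return "no form"


print(classify_form({"/Fields": ["name"], "/XFA": "<xdp/>"}))
```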
If you have a dynamic XFA form, removing the XML removes the entire form. There is nothing left.
However, it seems that you are dealing with a hybrid form that consists of both AcroForm syntax and XFA. Hybrid forms are a pain because they often lead to confusion. For example: a viewer that is not XFA-aware will show you the data stored in the AcroForm. A viewer that is XFA-aware may give preference to the data stored in the XFA. What is the problem, you ask? Aren't both forms equivalent?
Ideally, both versions of the form are equivalent, but:
- If the form was not filled out correctly (for example, by a tool that updated only one of the two descriptions), the AcroForm data may differ from the XFA data.
- XFA has more functionality than AcroForm technology. For example: a text field in an XFA form can be justified (similar to <p align="justify"> in HTML). This option does not exist for an AcroForm text field, which can only be left-aligned, centered, or right-aligned. Therefore, if text is justified in the XFA form but you look only at the AcroForm, the text will not be justified.
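The alignment mismatch can be illustrated with a small sketch. An AcroForm text field stores its alignment in the /Q (quadding) entry, which only knows 0 (left), 1 (centered), and 2 (right). The helper name `to_acroform_q` and the choice to fall back to left alignment are assumptions made for illustration; a real converter may choose differently.

```python
# /Q (quadding) values that an AcroForm text field can express.
ACROFORM_Q = {"left": 0, "center": 1, "right": 2}


def to_acroform_q(xfa_h_align):
    """Return the closest /Q value for an XFA horizontal alignment.

    Assumption: any alignment AcroForm cannot express (such as
    "justify") falls back to left alignment (0).
    """
    return ACROFORM_Q.get(xfa_h_align, 0)


for align in ("left", "center", "right", "justify"):
    print(align, "->", to_acroform_q(align))
```

The point is that the mapping is lossy: once "justify" has been flattened to a /Q value, the AcroForm description no longer records that the text was meant to be justified.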
This is a long answer to explain that if you have a hybrid form, in most cases it's okay to throw away the XFA part. You may see slight differences, but if you are happy with how the form looks in the Ubuntu Document Viewer (a viewer that does not support XFA), then you should be fine.
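At the object level, throwing away the XFA part means dropping the /XFA entry from the /AcroForm dictionary and keeping everything else. The sketch below models that dictionary as a plain Python dict; `remove_xfa` is a hypothetical helper, not the API of Pdftk, iText, or any other tool.

```python
def remove_xfa(acroform):
    """Return a copy of the /AcroForm dict without its /XFA entry."""
    return {key: value for key, value in acroform.items() if key != "/XFA"}


# Hybrid form: the AcroForm fields survive the removal.
hybrid = {"/Fields": ["name", "address"], "/XFA": "<xdp/>"}
stripped = remove_xfa(hybrid)
print(stripped)

# Dynamic XFA form: nothing is left once the XML is gone.
dynamic = {"/XFA": "<xdp/>"}
print(remove_xfa(dynamic))
```

This also shows why the same operation is harmless for a hybrid form and destructive for a dynamic XFA form.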
DISCLAIMER: I am the CEO of iText Group. Pdftk is a third-party tool based on an obsolete and no longer supported version of iText. iText Group does not support the use of Pdftk.