Some notes to add to the picture from @Millie's answer:
If you are in doubt about some aspect of PDF, the ISO 32000-1 specification should be consulted first.
It specifies the ID entry as follows:
ID (Required if an Encrypt entry is present; optional otherwise; PDF 1.1)
An array of two byte strings constituting a file identifier (see 14.4, “File Identifiers”) for the file. If there is an Encrypt entry, this array and the two byte strings shall be direct objects and shall be unencrypted.
NOTE 1 Because the ID entries are not encrypted, it is possible to check the ID key to assure that the correct file is being accessed without decrypting the file. The restrictions that the string be a direct object and not be encrypted assure that this is possible.
NOTE 2 Although this entry is optional, its absence might prevent the file from functioning in some workflows that depend on files being uniquely identified.
NOTE 3 The values of the ID strings are used as input to the encryption algorithm. If these strings were indirect, or if the ID array were indirect, these strings would be encrypted when written. This would result in a circular condition for a reader: the ID strings must be decrypted in order to use them to decrypt strings, including the ID strings themselves. The preceding restriction prevents this circular condition.
(Table 15 - Entries in the File Trailer Dictionary)
NOTE 2 above is essentially a recommendation to add this optional value, even though it is not formulated in the SHALL / SHOULD / MAY specification language used elsewhere in the document.
The recommendation is given in more detail in section 14.4 of the specification:
The ID entry is optional but should be used.
As meant in such specifications, this is a recommendation, and a recommendation is defined as something to be done unless there is a good reason not to. In other words, a PDF writer should create this entry unless it can argue against the requirement (I can hardly think of an argument against it). This should answer the question asked in a comment on @Millie's answer:
any idea why both pdfsharp and phantomjs create it?
It is not merely considered good practice, as assumed in another comment there.
Regarding the contents of the ID array, the specification continues in section 14.4:
The value of this entry shall be an array of two byte strings. The first byte string shall be a permanent identifier based on the contents of the file at the time it was originally created and shall not change when the file is incrementally updated. The second byte string shall be a changing identifier based on the file's contents at the time it was last updated. When a file is first written, both identifiers shall be set to the same value. If both identifiers match when a file reference is resolved, it is very likely that the correct and unchanged file has been found. If only the first identifier matches, a different version of the correct file has been found.
To help ensure the uniqueness of file identifiers, they should be computed by means of a message digest algorithm ...
The calculation of the file identifier need not be reproducible; all that matters is that the identifier is likely to be unique.
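The rules quoted above can be illustrated with a small Python sketch. This is only a minimal illustration under stated assumptions, not how any particular PDF library does it: the function names are hypothetical, MD5 is used as one example of a message digest, and the inputs fed into the digest (current time, path, size, document information values) are illustrative choices, since the quoted passage elides the exact list.

```python
import hashlib
import time
from typing import Dict, Tuple

IdPair = Tuple[bytes, bytes]


def _digest(path: str, size: int, info: Dict[str, str]) -> bytes:
    """Hash some file-related data; the choice of inputs is an assumption."""
    md5 = hashlib.md5()
    md5.update(str(time.time_ns()).encode("ascii"))  # not reproducible, by design
    md5.update(path.encode("utf-8"))
    md5.update(str(size).encode("ascii"))
    for key in sorted(info):
        md5.update(info[key].encode("utf-8"))
    return md5.digest()


def first_write_id(path: str, size: int, info: Dict[str, str]) -> IdPair:
    """On first write, both identifiers get the same value."""
    value = _digest(path, size, info)
    return (value, value)


def incremental_update_id(old: IdPair, path: str, size: int,
                          info: Dict[str, str]) -> IdPair:
    """On update, keep the permanent first string and replace the second."""
    return (old[0], _digest(path, size, info))


def match_reference(expected: IdPair, found: IdPair) -> str:
    """Classify a resolved file reference as described in section 14.4."""
    if expected[0] != found[0]:
        return "no match"
    if expected[1] == found[1]:
        return "unchanged"
    return "different version"


def format_id_entry(pair: IdPair) -> str:
    """Render the pair as a trailer entry using hexadecimal strings."""
    return "/ID [<%s> <%s>]" % (pair[0].hex().upper(), pair[1].hex().upper())
```

A writer would call `first_write_id` once when creating the file and `incremental_update_id` on every later save, so that only the second string changes over the lifetime of the document.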
So the other article quoted from is also not entirely correct in saying
a program that creates PDF files is only required to create a file identifier if the file is to be encrypted.
Even in the absence of encryption, such a program must have a good reason not to create file identifiers, because the specification recommends them. Thus, in the absence of such a reason, creating a file identifier is required.
All that said, any PDF consumer should always be prepared to encounter PDFs without a file identifier ... there may be a reason for not creating it.
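On the consumer side, being prepared for a missing identifier can be as simple as treating the ID as optional. The following Python sketch is a deliberately naive illustration, not a real PDF parser: the hypothetical `read_file_id` function assumes a classic `trailer` keyword (not a cross-reference stream) and an ID array written as two hexadecimal strings, and it returns `None` in every other case.

```python
import re
from typing import Optional, Tuple

# Matches "/ID [<hex> <hex>]"; literal (parenthesized) strings are not handled.
_ID_RE = re.compile(
    rb"/ID\s*\[\s*<([0-9A-Fa-f\s]*)>\s*<([0-9A-Fa-f\s]*)>\s*\]"
)


def _hex_to_bytes(raw: bytes) -> bytes:
    """Decode a PDF hexadecimal string body, ignoring embedded whitespace."""
    digits = b"".join(raw.split()).decode("ascii")
    if len(digits) % 2:
        digits += "0"  # PDF treats a missing final digit as 0
    return bytes.fromhex(digits)


def read_file_id(pdf_bytes: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Return the (permanent, changing) identifier pair, or None if absent."""
    start = pdf_bytes.rfind(b"trailer")
    if start < 0:
        return None  # no classic trailer found; treat as "no ID"
    match = _ID_RE.search(pdf_bytes, start)
    if match is None:
        return None  # the ID entry is optional, so this is not an error
    return (_hex_to_bytes(match.group(1)), _hex_to_bytes(match.group(2)))
```

The design point is that the absence of an ID is reported as a normal result rather than raised as an error, so calling code has to decide explicitly what to do with an unidentified file.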