Validation against a schema can be done with almost no memory. The UPA (Unique Particle Attribution) constraint ensures that validating against a content model never requires backtracking. Of course, you do need to track your current state in the content model's finite state machine for each element on the stack of open elements, so the memory needed is proportional to the maximum depth of the document.
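To make the stack-of-states idea concrete, here is a minimal sketch in Python using the standard `xml.sax` streaming parser. It is not how any particular schema processor is implemented: the `MODELS` table, the `StreamingValidator` class and the `validate` function are all hypothetical, the content models are hand-written DFAs rather than compiled from a real schema, and text and attributes are ignored. Because each child name selects at most one transition, a single forward pass is enough and nothing is ever re-read.

```python
import xml.sax

# Hypothetical content models: element name -> (start, transitions, accepting).
# "book" must contain one <title> followed by one or more <chapter> elements.
MODELS = {
    "book": (0, {0: {"title": 1}, 1: {"chapter": 2}, 2: {"chapter": 2}}, {2}),
    "title": (0, {}, {0}),
    "chapter": (0, {}, {0}),
}
EMPTY = (0, {}, {0})  # fallback model for undeclared elements: no children allowed


class StreamingValidator(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.stack = []      # one [element name, DFA state] pair per open element
        self.errors = []
        self.max_depth = 0   # largest stack size seen, i.e. the memory high-water mark

    def startElement(self, name, attrs):
        if self.stack:
            # Advance the parent's DFA by one step on this child name.
            parent, state = self.stack[-1]
            transitions = MODELS.get(parent, EMPTY)[1]
            nxt = transitions.get(state, {}).get(name)
            if nxt is None:
                self.errors.append(f"unexpected <{name}> in <{parent}>")
            else:
                self.stack[-1][1] = nxt
        model = MODELS.get(name)
        if model is None:
            self.errors.append(f"undeclared element <{name}>")
            model = EMPTY
        self.stack.append([name, model[0]])
        self.max_depth = max(self.max_depth, len(self.stack))

    def endElement(self, name):
        # The element is complete only if its DFA stopped in an accepting state.
        _, state = self.stack.pop()
        if state not in MODELS.get(name, EMPTY)[2]:
            self.errors.append(f"incomplete content in <{name}>")


def validate(xml_text):
    handler = StreamingValidator()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.errors, handler.max_depth
```

For `<book><title/><chapter/><chapter/></book>` this returns no errors and a maximum stack size of 2, no matter how many chapters the document contains.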
ID/IDREF validation is an exception: for this, the processor needs memory proportional to the number of ID and IDREF values found. Roughly, the processor remembers every ID and IDREF it encounters, and when it reaches the end of the document, it checks that no ID appears twice and that each IDREF matches one of the IDs. Similarly, to check unique/key/keyref constraints, the processor must remember which key values it has seen. But the memory required for this is much less than "storing all the XML in memory."
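The ID/IDREF bookkeeping can be sketched the same way. This is only an illustration under stated assumptions: the `IdRefChecker` class and `check_ids` function are hypothetical, and the attribute names `id` and `ref` are assumed to carry the ID and IDREF values, whereas a real processor would learn which attributes have those types from the schema or DTD. The only state kept is two sets of strings; duplicates are reported as soon as they are seen, which is equivalent to checking at the end.

```python
import xml.sax


class IdRefChecker(xml.sax.ContentHandler):
    # id_attr / idref_attr are assumed attribute names, not taken from a schema.
    def __init__(self, id_attr="id", idref_attr="ref"):
        super().__init__()
        self.id_attr = id_attr
        self.idref_attr = idref_attr
        self.ids = set()      # every ID value seen so far
        self.idrefs = set()   # every IDREF value seen so far
        self.errors = []

    def startElement(self, name, attrs):
        id_value = attrs.get(self.id_attr)
        if id_value is not None:
            if id_value in self.ids:
                self.errors.append(f"duplicate ID '{id_value}'")
            self.ids.add(id_value)
        ref_value = attrs.get(self.idref_attr)
        if ref_value is not None:
            # A reference may point forward, so it can only be resolved at the end.
            self.idrefs.add(ref_value)

    def endDocument(self):
        for ref_value in sorted(self.idrefs - self.ids):
            self.errors.append(f"IDREF '{ref_value}' has no matching ID")


def check_ids(xml_text):
    handler = IdRefChecker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.errors
```

Memory here grows with the number of distinct ID and IDREF values, not with the size of the document: element names, text, and all other attributes are discarded as soon as the parser has reported them.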