When calling an external server over HTTP from MarkLogic, the result needs to fit in memory, possibly as several copies depending on what you do with it, and string variables are not optimized for extremely large values. Depending on what your remote service supports, you can fetch large data in pieces with paginated HTTP requests (using Range request headers), as in the sketch below.
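A minimal sketch of pulling a large resource in fixed-size pieces with Range requests; the URL, chunk size, and chunk count are hypothetical, and it assumes the remote service honors Range headers:

    let $chunk := 1048576                       (: 1 MB per request :)
    for $n in (0 to 9)                          (: first 10 MB, one request each :)
    let $start := $n * $chunk
    let $end   := $start + $chunk - 1
    return
      xdmp:http-get(
        "http://example.com/big-export.csv",
        <options xmlns="xdmp:http">
          <headers>
            <range>bytes={$start}-{$end}</range>
          </headers>
        </options>
      )[2]                                      (: item 1 is the response metadata, item 2 is the body :)

Each request stays small, so a network failure only costs one chunk rather than the whole transfer.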
Even if the 2 GB limit were removed, performance would be poor and unreliable: transferring a large amount of data in a single HTTP request becomes increasingly fragile as the size grows, because any serious network error forces a complete retry.
Alternatively, you can add a service or a local proxy service that stores the data in a shared location, such as a mounted file system or S3, and returns a link to the data instead of its body. The xdmp:filesystem-* and xdmp:binary-* functions can then be used to access the data.
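A sketch under that assumption: the proxy has written the payload to a (hypothetical) path on a filesystem mounted on the MarkLogic host and returned that path.

    let $path := "/mnt/shared/exports/big-export.bin"
    where xdmp:filesystem-file-exists($path)
    return
      (: construct an external binary node that references the file on disk,
         then inspect its size without copying the bytes into a string variable :)
      xdmp:binary-size(xdmp:external-binary($path))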
Even once the data is in memory, processing very large text as a single string will be problematic. If you need access to one large object, binary documents (internal or external) can be used for better reliability.
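For example, a sketch of keeping one large object as an external binary document, so the database stores a reference while the bytes stay on disk (the path and document URI are hypothetical):

    xdmp:document-insert(
      "/downloads/big-export.bin",
      xdmp:external-binary("/mnt/shared/exports/big-export.bin")
    )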
If the HTTP request can be made a GET instead of a POST, then xdmp:document-load may be able to stream the result directly into a database document.
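A minimal sketch, assuming your MarkLogic version accepts an http:// location here (see the note below); the URL and target URI are hypothetical:

    xdmp:document-load(
      "http://example.com/api/export?job=1234",
      <options xmlns="xdmp:document-load">
        <uri>/downloads/export-1234.xml</uri>
        <format>xml</format>
      </options>
    )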
Comments in the documentation for xdmp:document-load suggest you can use a URI prefix to GET (or POST) and stream the result directly into the database, although I don't know how to pass a POST body that way.