Streaming data of unknown size from client to server via HTTP in Python

Question


Unfortunately, my previous question was closed as an "exact duplicate" of another question, which it definitely is NOT, so I am asking again.

This is not a duplicate of "Python: HTTP Post a large file with streaming".

That question is about streaming one big file; I want to send arbitrary chunks of a file one by one over the same HTTP connection. Say I have a 20 MB file: I want to open an HTTP connection, send 1 MB, send another 1 MB, and so on until it is complete, all over the same connection, so that the server sees one 20 MB chunk arrive over that connection.

Mmapping a file is something I ALSO intend to do, but that does not work when the data is read from stdin. It is primarily for this second case that I am looking for this piece-by-piece feeding of data.

Honestly, I wonder whether this can be done at all. If not, I would like to know, and then the question can be closed. But if it can be done, how?


python http upload

Wouter Oct 13 '12 at 8:24


1 answer

Vasiliy Faronov · Accepted Answer · 2012-10-13T11:06:10+0000

From the client's point of view, it is easy. You can use httplib's low-level interface (putrequest, putheader, endheaders, and send) to send whatever you want to the server in pieces of any size.

But you also need to indicate where your file ends.

If you know the total size of the file in advance, you can simply include the Content-Length header, and the server will stop reading the request body after that many bytes. The code may then look like this.

    import httplib
    import os.path

    total_size = os.path.getsize('/path/to/file')
    # Binary mode, so the bytes sent match the Content-Length header.
    infile = open('/path/to/file', 'rb')

    conn = httplib.HTTPConnection('example.org')
    conn.connect()
    conn.putrequest('POST', '/upload/')
    conn.putheader('Content-Type', 'application/octet-stream')
    conn.putheader('Content-Length', str(total_size))
    conn.endheaders()

    # Send the body piece by piece; the pieces can be of any size.
    while True:
        chunk = infile.read(1024)
        if not chunk:
            break
        conn.send(chunk)

    infile.close()
    resp = conn.getresponse()

If you do not know the total size in advance, the theoretical answer is chunked transfer encoding. The problem is that, while it is widely used for responses, it seems less popular (although just as well defined) for requests. Stock HTTP servers may not be able to handle it out of the box. But if the server is under your control too, you could try manually parsing the chunks from the request body and reassembling them into the original file.
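As a rough illustration of the client side of chunked transfer encoding, the sketch below frames each piece by hand (hex length, CRLF, data, CRLF) and finishes with a zero-length chunk. It assumes Python 3, where httplib is named http.client. The helper names encode_chunk and stream_chunked and the 1 MB default piece size are made up for this example, and it only works if the server really accepts chunked request bodies.

```python
import http.client  # Python 3 name of httplib; only needed by the caller


def encode_chunk(data):
    """Frame one piece of data as an HTTP/1.1 chunk."""
    size_line = ("%x\r\n" % len(data)).encode("ascii")  # chunk size in hex
    return size_line + data + b"\r\n"


# A zero-length chunk followed by an empty line ends the request body.
LAST_CHUNK = b"0\r\n\r\n"


def stream_chunked(conn, path, source, piece_size=1024 * 1024):
    """POST everything readable from `source` without knowing its size.

    `conn` is expected to behave like http.client.HTTPConnection and
    `source` like a binary file object (for example sys.stdin.buffer).
    """
    conn.putrequest("POST", path)
    conn.putheader("Content-Type", "application/octet-stream")
    conn.putheader("Transfer-Encoding", "chunked")  # instead of Content-Length
    conn.endheaders()
    while True:
        piece = source.read(piece_size)
        if not piece:
            break
        conn.send(encode_chunk(piece))
    conn.send(LAST_CHUNK)
    return conn.getresponse()
```

With this in place, something like stream_chunked(http.client.HTTPConnection('example.org'), '/upload/', sys.stdin.buffer) would feed stdin to the server piece by piece over a single request.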

Another option is to send each piece as a separate request (with Content-Length) over the same connection. You would still need to implement custom logic on the server, though, and you would need to persist state between requests.
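A minimal sketch of the client half of this approach might look as follows, again assuming Python 3's http.client. The X-Upload-Id, X-Piece-Index and X-Last-Piece headers are hypothetical: they stand in for whatever convention your own server-side logic uses to tie the requests together, and the function name send_in_pieces is likewise invented here.

```python
import http.client  # Python 3 name of httplib; only needed by the caller
import uuid


def send_in_pieces(conn, path, source, piece_size=1024 * 1024):
    """Send `source` as a series of POST requests over one connection.

    Returns the number of requests made. The X-* headers are an invented
    convention; the server has to implement the matching reassembly.
    """
    upload_id = uuid.uuid4().hex  # lets the server group the pieces
    index = 0
    piece = source.read(piece_size)
    while piece:
        # Read one piece ahead so the final request can be flagged.
        next_piece = source.read(piece_size)
        headers = {
            "Content-Type": "application/octet-stream",
            "X-Upload-Id": upload_id,
            "X-Piece-Index": str(index),
            "X-Last-Piece": "0" if next_piece else "1",
        }
        # For a bytes body, http.client adds Content-Length itself.
        conn.request("POST", path, body=piece, headers=headers)
        resp = conn.getresponse()
        resp.read()  # drain the response so the connection can be reused
        if resp.status >= 300:
            raise RuntimeError("piece %d rejected: %d" % (index, resp.status))
        piece = next_piece
        index += 1
    return index
```

Each response has to be read completely before the next request is issued, otherwise http.client refuses to reuse the connection.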

Added 2012-12-27. There's an nginx module that converts chunked requests into regular ones. It may be helpful as long as you do not need true streaming (that is, starting to handle the request before the client has finished sending it).
