How to encode UTF8 file name for HTTP headers? (Python, Django)

I have a problem with HTTP headers: they are encoded in ASCII, and I want to provide a view for downloading files whose names may be non-ASCII.

response['Content-Disposition'] = 'attachment; filename="%s"' % (vo.filename.encode("ASCII","replace"), ) 

I do not want to serve these as static files, because non-ASCII file names cause the same kind of problem there, this time with the file system and its file name encoding. (I do not know the target OS.)

I have already tried urllib.quote(), but it raises a KeyError exception.

Maybe I'm doing something wrong, but maybe this is not possible.

+44
python django escaping
Sep 01 '09 at 10:00
6 answers

This is a FAQ.

There is no interoperable way to do this. Some browsers implement proprietary extensions (IE, Chrome), others implement RFC 2231 (Firefox, Opera).

See the test cases at http://greenbytes.de/tech/tc2231/.

Update: as of November 2012, all current desktop browsers support the encoding defined in RFC 6266 and RFC 5987 (Safari >= 6, IE >= 9, Chrome, Firefox, Opera, Konqueror).

+36
Sep 01 '09 at 10:11

Do not send a file name in Content-Disposition. There is no way to make non-ASCII header parameters work cross-browser (*).

Instead, send only "Content-Disposition: attachment" and put the file name, as a URL-encoded UTF-8 string, in the trailing part (PATH_INFO) of your URL, so that the browser picks it up and uses it by default. Browsers handle UTF-8 URLs much more reliably than anything related to Content-Disposition.

(*: in fact, there is not even a current standard that says how this should be done, since the relationship between RFC 2616, 2231 and 2047 is rather dysfunctional, something that Julian is trying to get cleared up at the specification level. Consistent browser support is in the distant future.)
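
The URL-based approach above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper name download_url and the /download/42 path are hypothetical, and how the URL is routed to the view depends on your application.

```python
from urllib.parse import quote


def download_url(base_path, file_name):
    """Return a URL path whose last segment is the percent-encoded UTF-8 file name.

    Hypothetical helper for illustration only.
    """
    # safe="" also escapes "/" so the name stays a single path segment.
    return base_path.rstrip("/") + "/" + quote(file_name, safe="")


# The response then carries no filename parameter at all;
# the browser derives the default name from the end of the URL.
headers = {"Content-Disposition": "attachment"}

print(download_url("/download/42", "My Résumé.pdf"))
```

The view serving that URL can ignore the trailing segment entirely; it exists only so the browser has a name to suggest.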

+30
01 Sep '09 at 23:41

Note that in 2011, RFC 6266 (especially Appendix D) weighed in on this issue and gives specific recommendations.

Namely, you can send a filename parameter containing only ASCII characters, followed by a filename* parameter holding the file name in RFC 5987 format for those agents that understand it.

This will usually look like filename="my-resume.pdf"; filename*=UTF-8''My%20R%C3%A9sum%C3%A9.pdf, where the Unicode file name ("My Résumé.pdf") is encoded as UTF-8 and then percent-encoded (note: do NOT use + for spaces).

Please do read RFC 6266 and RFC 5987 (or use a reliable and tested library that abstracts this for you), as my summary here omits important details.
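
To make the "%20, not +" point concrete, here is a small sketch using only the Python 3 standard library. The ASCII fallback name "my-resume.pdf" is taken from the example above; passing safe="" to quote is an assumption on my part, chosen so that every reserved character gets escaped.

```python
from urllib.parse import quote, quote_plus, unquote

name = "My Résumé.pdf"

# quote() encodes the text as UTF-8 and percent-encodes it; spaces become %20.
ext_value = "UTF-8''" + quote(name, safe="")

# quote_plus() is meant for HTML form data: it turns spaces into "+",
# which is not what the filename* parameter expects.
form_style = quote_plus(name)

header = 'attachment; filename="my-resume.pdf"; filename*=' + ext_value

print(header)
print(form_style)
# Percent-decoding the value part recovers the original name.
print(unquote(ext_value.split("'", 2)[2]))
```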

+28
Jan 25 2018-12-12T00:

I can say that I have had success with the newer (RFC 5987) way of specifying a header, encoded in the e-mail form (RFC 2231). I came up with the following solution, based on code from the django-sendfile project.

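    import unicodedata

    # urlquote was removed in Django 4.0; urllib.parse.quote is the
    # standard-library equivalent on newer versions.
    from django.utils.http import urlquote


    def rfc5987_content_disposition(file_name):
        # Build an ASCII-only fallback: decompose accented characters and
        # drop whatever cannot be represented in ASCII.
        ascii_name = unicodedata.normalize('NFKD', file_name).encode('ascii', 'ignore').decode()
        header = 'attachment; filename="{}"'.format(ascii_name)
        if ascii_name != file_name:
            # Add the RFC 5987 form only when the fallback lost information.
            quoted_name = urlquote(file_name)
            header += '; filename*=UTF-8\'\'{}'.format(quoted_name)
        return header

    # e.g.
    # response['Content-Disposition'] = rfc5987_content_disposition(file_name)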

I have only tested my code on Python 3.4 with Django 1.8, so the similar solution in django-sendfile may suit you better.

There is a long-standing ticket in Django's tracker that acknowledges this, but no patches have been proposed yet, AFAICT. Unfortunately, this is as close to using a robust, tested library as I could find; please let me know if there is a better solution.
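
As one possible alternative for the encoding step alone, Python's standard library has email.utils.encode_rfc2231, which produces the same charset'language'value form. The sketch below assumes that output is acceptable as a filename* value for HTTP; the helper name ext_value is hypothetical, and the ASCII fallback still has to be built separately.

```python
from email.utils import encode_rfc2231


def ext_value(file_name):
    """Return file_name in charset'language'value form (hypothetical helper)."""
    # With a charset and no language, the language part is left empty.
    return encode_rfc2231(file_name, charset="UTF-8")


print("attachment; filename*=" + ext_value("My Résumé.pdf"))
```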

+4
Nov 13 '17 at 10:14

As of 2018, a solution is available in Django 2.1 (after languishing for seven years as an open ticket). You can use the as_attachment parameter built into FileResponse. For example, to return a file output_file with MIME type output_mime_type as an HTTP response:

    response = FileResponse(open(output_file, 'rb'), as_attachment=True, content_type=output_mime_type)
    return response

Or, if you cannot use FileResponse, you can use the relevant part of its source to set the Content-Disposition header more directly. Here is what that source currently looks like:

    from urllib.parse import quote

    try:
        # Succeeds only if the name is pure ASCII.
        filename.encode('ascii')
        file_expr = 'filename="{}"'.format(filename)
    except UnicodeEncodeError:
        # Handle a non-ASCII filename
        file_expr = "filename*=utf-8''{}".format(quote(filename))
    response['Content-Disposition'] = 'attachment; {}'.format(file_expr)

+1
Dec 05 '18 at 0:28

A hack:

    if (Request.UserAgent.Contains("IE"))
    {
        // IE will accept URL encoding, but spaces don't need to be, and since they're so common..
        filename = filename.Replace("%", "%25").Replace(";", "%3B").Replace("#", "%23").Replace("&", "%26");
    }

0
Jun 29 '10 at 20:58