Note: When this question was asked, the correct way to fetch only the headers and stream the body was to use prefetch=False. That parameter has since been renamed to stream, with the Boolean value inverted, so you now want stream=True.
The following is the original answer.
Once you use iter_content() , you must continue to use it; .text indirectly uses the same interface under the hood (via .content ).
In other words, if you use iter_content() at all, you have to do the work that .text does manually:
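    import magic
    import requests
    from requests.compat import chardet

    # stream=True defers the body download (older releases spelled this prefetch=False)
    r = requests.get("http://www.december.com/html/demo/hello.html", stream=True)
    peek = next(r.iter_content(256))
    mime = magic.from_buffer(peek, mime=True)

    if mime == "text/html":
        # the peeked bytes are already consumed, so prepend them to the rest
        contents = peek + b''.join(r.iter_content(10 * 1024))
        encoding = r.encoding
        if encoding is None:
            # detect encoding
            encoding = chardet.detect(contents)['encoding']
        try:
            textcontent = str(contents, encoding, errors='replace')
        except (LookupError, TypeError):
            textcontent = str(contents, errors='replace')
        print(textcontent)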
This assumes you are using Python 3.
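The decoding step of that manual .text work can be pulled out into a small helper, which makes it easy to try without a network connection. This is only a sketch and not part of requests: the name decode_body and its detect parameter are hypothetical, and detect stands in for whatever charset detector you prefer (for example, a thin wrapper around chardet.detect).

```python
def decode_body(contents, encoding=None, detect=None):
    """Decode raw body bytes to text, falling back when the charset is unusable.

    contents -- the body as bytes
    encoding -- the declared charset, or None if the server sent none
    detect   -- optional callable (hypothetical hook) that takes the bytes
                and returns a charset name
    """
    if encoding is None and detect is not None:
        encoding = detect(contents)
    try:
        return str(contents, encoding, errors="replace")
    except (LookupError, TypeError):
        # LookupError: unknown charset name; TypeError: encoding is still None.
        # Either way, fall back to Python's default (UTF-8).
        return str(contents, errors="replace")
```

With a response r and the assembled contents, this would be called as decode_body(contents, r.encoding).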
An alternative is to make two requests:
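    import magic
    import requests

    # first request: stream the body and sniff only the first 256 bytes
    r = requests.get("http://www.december.com/html/demo/hello.html", stream=True)
    mime = magic.from_buffer(next(r.iter_content(256)), mime=True)

    if mime == "text/html":
        # second request: download the whole body and let .text decode it
        print(requests.get("http://www.december.com/html/demo/hello.html").text)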
Python 2 version of the first example:
    import magic
    import requests
    from requests.compat import chardet

    # stream=True defers the body download (older releases spelled this prefetch=False)
    r = requests.get("http://www.december.com/html/demo/hello.html", stream=True)
    peek = r.iter_content(256).next()
    mime = magic.from_buffer(peek, mime=True)

    if mime == "text/html":
        contents = peek + ''.join(r.iter_content(10 * 1024))
        encoding = r.encoding
        if encoding is None:
            # detect encoding
            encoding = chardet.detect(contents)['encoding']
        try:
            textcontent = unicode(contents, encoding, errors='replace')
        except (LookupError, TypeError):
            textcontent = unicode(contents, errors='replace')
        print(textcontent)
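The peek-then-join pattern used in these examples can be tried offline. The sketch below is an illustration only: io.BytesIO stands in for the network stream, and iter_chunks is a hypothetical stand-in for iter_content(), not the real implementation. Each call continues from wherever the previous one stopped, which is why the peeked bytes have to be added back in front of the rest.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive chunks from a file-like object until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


payload = b"<html><body>Hello</body></html>"
body = io.BytesIO(payload)  # stand-in for the streamed response body

peek = next(iter_chunks(body, 6))         # consumes the first 6 bytes
rest = b"".join(iter_chunks(body, 1024))  # picks up where the peek stopped

full = peek + rest  # the complete body, reassembled
```

Once the stream has been read this way, nothing is left for a later read, which mirrors why .text cannot be used after iter_content() has consumed part of the body.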