From: Cory Benfield
Date: Fri, 10 Aug 2012 13:47:13 +0000 (+0100)
Subject: Document encodings and RFC compliance.
X-Git-Tag: v0.13.7~10^2
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=7a9419ce356d0be0383ec83b14e66658ea7f96be;p=services%2Fpython-requests.git

Document encodings and RFC compliance.
---

diff --git a/docs/user/advanced.rst b/docs/user/advanced.rst
index adda9c7..0ac450d 100644
--- a/docs/user/advanced.rst
+++ b/docs/user/advanced.rst
@@ -343,6 +343,31 @@ To use HTTP Basic Auth with your proxy, use the `http://user:password@host/` syn
         "http": "http://user:pass@10.10.1.10:3128/",
     }
 
+Compliance
+----------
+
+Requests is intended to be compliant with all relevant specifications and
+RFCs where that compliance will not cause difficulties for users. This
+attention to the specification can lead to some behaviour that may seem
+unusual to those not familiar with the relevant specification.
+
+Encodings
+^^^^^^^^^
+
+When you receive a response, Requests makes a guess at the encoding to use for
+decoding the response when you call the ``Response.text`` method. Requests
+will first check for an encoding in the HTTP header, and if none is present,
+will use `chardet `_ to attempt to guess
+the encoding.
+
+The only time Requests will not do this is if no explicit charset is present
+in the HTTP headers **and** the ``Content-Type`` header contains ``text``. In
+this situation,
+`RFC 2616 `_
+specifies that the default charset must be ``ISO-8859-1``. Requests follows
+the specification in this case. If you require a different encoding, you can
+manually set the ``Response.encoding`` property, or use the raw
+``Request.content``.
 
 HTTP Verbs
 ----------
diff --git a/docs/user/quickstart.rst b/docs/user/quickstart.rst
index 9b0399d..71ffea0 100644
--- a/docs/user/quickstart.rst
+++ b/docs/user/quickstart.rst
@@ -86,12 +86,22 @@ again::
 Requests will automatically decode content from the server. Most unicode
 charsets are seamlessly decoded.
 
-When you make a request, ``r.encoding`` is set, based on the HTTP headers.
-Requests will use that encoding when you access ``r.text``. If ``r.encoding``
-is ``None``, Requests will make an extremely educated guess of the encoding
-of the response body. You can manually set ``r.encoding`` to any encoding
-you'd like, and that charset will be used.
-
+When you make a request, Requests makes educated guesses about the encoding of
+the response based on the HTTP headers. The text encoding guessed by Requests
+is used when you access ``r.text``. You can find out what encoding Requests is
+using, and change it, using the ``r.encoding`` property::
+
+    >>> r.encoding
+    'utf-8'
+    >>> r.encoding = 'ISO-8859-1'
+
+If you change the encoding, Requests will use the new value of ``r.encoding``
+whenever you call ``r.text``.
+
+Requests will also use custom encodings in the event that you need them. If
+you have created your own encoding and registered it with the ``codecs``
+module, you can simply use the codec name as the value of ``r.encoding`` and
+Requests will handle the decoding for you.
 
 Binary Response Content
 -----------------------
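
The lookup order described in the new "Encodings" text (explicit charset first, then the ``ISO-8859-1`` default for ``text`` content types, then a guess from the body) can be sketched with the standard library alone. This is an illustrative sketch, not Requests' actual implementation: the names ``charset_from_content_type``, ``guess_encoding`` and ``detect`` are hypothetical, and the body-sniffing step that the patch attributes to chardet is represented by a caller-supplied callback.

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def guess_encoding(headers, detect=None):
    """Pick an encoding following the order described in the patch (hypothetical helper).

    headers: plain dict of response headers.
    detect:  optional zero-argument callable standing in for body-based
             detection (an assumption; the patch names chardet for this step).
    """
    content_type = headers.get("Content-Type", "")

    # 1. An explicit charset in the header always wins.
    charset = charset_from_content_type(content_type) if content_type else None
    if charset:
        return charset

    # 2. No charset, but a text content type: fall back to the RFC 2616 default.
    if "text" in content_type.lower():
        return "ISO-8859-1"

    # 3. Otherwise defer to body-based detection, if any was supplied.
    return detect() if detect else None


if __name__ == "__main__":
    body = "café".encode("utf-8")
    encoding = guess_encoding({"Content-Type": "text/html"})
    # The body is really UTF-8, so the ISO-8859-1 default mis-decodes it.
    print(encoding, repr(body.decode(encoding)))
    # Overriding the encoding by hand, as the patch suggests, fixes the text.
    print("utf-8", repr(body.decode("utf-8")))
```

The final two lines show why the documentation points readers at ``r.encoding``: a server that sends UTF-8 text without declaring a charset is decoded with the specification's default, and setting the encoding explicitly is the way to correct it.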