Document difference of string handling in Python 2/3

Daniele Varrazzo 2011-02-10 02:15:44 +00:00
parent 713b86acdf
commit 1a0c494417
2 changed files with 39 additions and 15 deletions

@@ -189,8 +189,9 @@ deal with Python objects adaptation:
 .. method:: getquoted()
 Subclasses or other conforming objects should return a valid SQL
-string representing the wrapped object. The `!ISQLQuote`
-implementation does nothing.
+string representing the wrapped object. In Python 3 the SQL must be
+returned in a `!bytes` object. The `!ISQLQuote` implementation does
+nothing.
 .. method:: prepare(conn)
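
As an illustration of the `getquoted()` change in this hunk, here is a minimal sketch of a conforming object that returns its SQL as `bytes` under Python 3. `Point` and `PointAdapter` are hypothetical names, not part of Psycopg; the class only mimics the `prepare()`/`getquoted()` shape described above and is not registered with the library here (in real use that would presumably go through `psycopg2.extensions.register_adapter`).

```python
class Point:
    """Hypothetical application type, used only for this sketch."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointAdapter:
    """Hypothetical wrapper following the getquoted()/prepare() protocol."""

    def __init__(self, point):
        self._point = point

    def prepare(self, conn):
        # Nothing connection-specific is needed for plain numbers.
        pass

    def getquoted(self):
        # Build the SQL as text first, then encode it: in Python 3 the
        # result of getquoted() must be a bytes object.
        sql = "'(%r, %r)'::point" % (float(self._point.x), float(self._point.y))
        return sql.encode("ascii")


quoted = PointAdapter(Point(1, 2)).getquoted()
print(quoted)
```

Encoding as ASCII is enough here because the SQL contains only digits and punctuation; an adapter that quotes arbitrary text would need to honour the connection encoding instead.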

@@ -233,15 +233,37 @@ the SQL string that would be sent to the database.
 .. index::
 pair: Strings; Adaptation
 single: Unicode; Adaptation
+- String types: `!str`, `!unicode` are converted in SQL string syntax.
+`!unicode` objects (`!str` in Python 3) are encoded in the connection
+`~connection.encoding` to be sent to the backend: trying to send a character
+not supported by the encoding will result in an error. Received data can be
+converted either as `!str` or `!unicode`: see :ref:`unicode-handling` for
+received, either `!str` or `!unicode`
+.. index::
 single: Buffer; Adaptation
 single: bytea; Adaptation
 single: Binary string
-- String types: `!str`, `!unicode` are converted in SQL string
-syntax. `!buffer` is converted in PostgreSQL binary string syntax,
-suitable for :sql:`bytea` fields. When reading textual fields, either
-`!str` or `!unicode` can be received: see
-:ref:`unicode-handling`.
+- Binary types: Python types such as `!bytes`, `!bytearray`, `!buffer`,
+`!memoryview` are converted in PostgreSQL binary string syntax, suitable for
+:sql:`bytea` fields. Received data is returned as `!buffer` (in Python 2) or
+`!memoryview` (in Python 3).
+.. warning::
+PostgreSQL 9 uses by default `a new "hex" format`__ to emit :sql:`bytea`
+fields. Unfortunately this format can't be parsed by libpq versions
+before 9.0. This means that using a library client with version lesser
+than 9.0 to talk with a server 9.0 or later you may have problems
+receiving :sql:`bytea` data. To work around this problem you can set the
+`bytea_output`__ parameter to ``escape``, either in the server
+configuration or in the client session using a query such as ``SET
+bytea_output TO escape;`` before trying to receive binary data.
+.. __: http://www.postgresql.org/docs/9.0/static/datatype-binary.html
+.. __: http://www.postgresql.org/docs/9.0/static/runtime-config-client.html#GUC-BYTEA-OUTPUT
 .. index::
 single: Adaptation; Date/Time objects
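
To make the `bytea` warning in this hunk more concrete, the following is a simplified sketch of the two textual output formats, written in plain Python with no database involved. The helper names `decode_bytea_hex` and `decode_bytea_escape` are hypothetical and are not Psycopg or libpq functions; the real parsing happens inside libpq, and this sketch only assumes the formats as described in the PostgreSQL binary data type documentation (``\x`` followed by hex digits, versus ``\ooo`` octal escapes and a doubled backslash).

```python
import binascii
import re


def decode_bytea_hex(text):
    """Decode the "hex" output format: a \\x prefix, then two hex digits per byte."""
    if not text.startswith("\\x"):
        raise ValueError("not hex-format bytea output")
    return binascii.unhexlify(text[2:])


def decode_bytea_escape(text):
    """Decode the "escape" output format: \\ooo octal escapes and doubled backslashes."""

    def unescape(match):
        token = match.group(0)
        if token == b"\\\\":
            return b"\\"  # a doubled backslash stands for one backslash
        # Otherwise the token is a backslash followed by three octal digits.
        return bytes([int(token[1:].decode("ascii"), 8)])

    return re.sub(rb"\\\\|\\[0-3][0-7]{2}", unescape, text.encode("latin-1"))


# The same three bytes in both formats.
print(decode_bytea_hex("\\x00ff41"))
print(decode_bytea_escape("\\000\\377A"))
```

A parser that only knows the escape format would treat ``\x00ff41`` as literal characters rather than three bytes, which is the failure mode the warning describes; setting ``bytea_output`` to ``escape`` makes the server emit the older form.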
@@ -338,8 +360,8 @@ defined on the database connection (the `PostgreSQL encoding`__, available in
 .. __: http://www.postgresql.org/docs/9.0/static/multibyte.html
 .. __: http://docs.python.org/library/codecs.html#standard-encodings
-When reading data from the database, the strings returned are usually 8 bit
-`!str` objects encoded in the database client encoding::
+When reading data from the database, in Python 2 the strings returned are
+usually 8 bit `!str` objects encoded in the database client encoding::
 >>> print conn.encoding
 UTF8
@@ -356,9 +378,10 @@ When reading data from the database, the strings returned are usually 8 bit
 >>> print type(x), repr(x)
 <type 'str'> '\xe0\xe8\xec\xf2\xf9\xa4'
-In order to obtain `!unicode` objects instead, it is possible to
-register a typecaster so that PostgreSQL textual types are automatically
-*decoded* using the current client encoding::
+In Python 3 instead the strings are automatically *decoded* in the connection
+`~connection.encoding`, as the `!str` object can represent Unicode characters.
+In Python 2 you must register a :ref:`typecaster
+<type-casting-from-sql-to-python>` in order to receive `!unicode` objects::
 >>> psycopg2.extensions.register_type(psycopg2.extensions.UNICODE, cur)
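
The Python 2/3 difference documented in this hunk comes down to whether the bytes received from the backend are decoded. A minimal sketch in plain Python 3, with no database involved, assuming the sample bytes in the doctest above were produced with a LATIN9 client encoding (the chunk shown here does not state the encoding in effect at that point, so this is an assumption):

```python
# Bytes as they would arrive from the backend (sample taken from the doctest).
raw = b"\xe0\xe8\xec\xf2\xf9\xa4"

# Assumption: the connection client encoding is LATIN9, which Python's
# codec registry knows as ISO 8859-15.
client_encoding = "iso8859-15"

# Python 2 without a typecaster hands back the undecoded 8 bit string.
# Python 3 (or Python 2 with the UNICODE typecaster registered) returns
# the decoded text instead, which is what this line reproduces.
text = raw.decode(client_encoding)
print(text)
```

Decoding with a different codec than the one the connection actually uses would either raise an error or silently produce the wrong characters, which is why the decoding is tied to the connection `encoding`.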
@@ -375,9 +398,9 @@ the connection or globally: see the function
 .. note::
-If you want to receive uniformly all your database input in Unicode, you
-can register the related typecasters globally as soon as Psycopg is
-imported::
+In Python 2, if you want to receive uniformly all your database input in
+Unicode, you can register the related typecasters globally as soon as
+Psycopg is imported::
 import psycopg2
 import psycopg2.extensions