Document difference of string handling in Python 2/3

2025-07-23 14:39:46 +03:00 · 2011-02-10 02:15:44 +00:00 · 2011-02-10 02:15:44 +00:00 · 1a0c494417
commit 1a0c494417
parent 713b86acdf
2 changed files with 39 additions and 15 deletions
--- a/doc/src/extensions.rst
+++ b/doc/src/extensions.rst
@@ -189,8 +189,9 @@ deal with Python objects adaptation:
    .. method:: getquoted()
        Subclasses or other conforming objects should return a valid SQL
-        string representing the wrapped object. The `!ISQLQuote`
-        implementation does nothing.
+        string representing the wrapped object. In Python 3 the SQL must be
+        returned in a `!bytes` object. The `!ISQLQuote` implementation does
+        nothing.
    .. method:: prepare(conn)
--- a/doc/src/usage.rst
+++ b/doc/src/usage.rst
@@ -233,15 +233,37 @@ the SQL string that would be sent to the database.
 .. index::
    pair: Strings; Adaptation
    single: Unicode; Adaptation
 - String types: `!str`, `!unicode` are converted in SQL string syntax.
  `!unicode` objects (`!str` in Python 3) are encoded in the connection
  `~connection.encoding` to be sent to the backend: trying to send a character
  not supported by the encoding will result in an error. Received data can be
  converted either as `!str` or `!unicode`: see :ref:`unicode-handling`.
 .. index::
    single: Buffer; Adaptation
    single: bytea; Adaptation
    single: Binary string
- String types: `!str`, `!unicode` are converted in SQL string
-  syntax.  `!buffer` is converted in PostgreSQL binary string syntax,
-  suitable for :sql:`bytea` fields. When reading textual fields, either
-  `!str` or `!unicode` can be received: see
-  :ref:`unicode-handling`.
+- Binary types: Python types such as `!bytes`, `!bytearray`, `!buffer`,
+  `!memoryview` are converted in PostgreSQL binary string syntax, suitable for
+  :sql:`bytea` fields. Received data is returned as `!buffer` (in Python 2) or
+  `!memoryview` (in Python 3).
+
  .. warning::
     PostgreSQL 9 uses by default `a new "hex" format`__ to emit :sql:`bytea`
     fields. Unfortunately this format can't be parsed by libpq versions
     before 9.0. This means that using a library client with version lesser
     than 9.0 to talk with a server 9.0 or later you may have problems
     receiving :sql:`bytea` data. To work around this problem you can set the
     `bytea_output`__ parameter to ``escape``, either in the server
     configuration or in the client session using a query such as ``SET
     bytea_output TO escape;`` before trying to receive binary data.
     .. __: http://www.postgresql.org/docs/9.0/static/datatype-binary.html
     .. __: http://www.postgresql.org/docs/9.0/static/runtime-config-client.html#GUC-BYTEA-OUTPUT
 .. index::
    single: Adaptation; Date/Time objects
@@ -338,8 +360,8 @@ defined on the database connection (the `PostgreSQL encoding`__, available in
 .. __: http://www.postgresql.org/docs/9.0/static/multibyte.html
 .. __: http://docs.python.org/library/codecs.html#standard-encodings
-When reading data from the database, the strings returned are usually 8 bit
-`!str` objects encoded in the database client encoding::
+When reading data from the database, in Python 2 the strings returned are
+usually 8 bit `!str` objects encoded in the database client encoding::
    >>> print conn.encoding
    UTF8
@@ -356,9 +378,10 @@ When reading data from the database, the strings returned are usually 8 bit
    >>> print type(x), repr(x)
    <type 'str'> '\xe0\xe8\xec\xf2\xf9\xa4'
-In order to obtain `!unicode` objects instead, it is possible to
-register a typecaster so that PostgreSQL textual types are automatically
-*decoded* using the current client encoding::
+In Python 3 instead the strings are automatically *decoded* in the connection
+`~connection.encoding`, as the `!str` object can represent Unicode characters.
+In Python 2 you must register a :ref:`typecaster
+<type-casting-from-sql-to-python>` in order to receive `!unicode` objects::
    >>> psycopg2.extensions.register_type(psycopg2.extensions.UNICODE, cur)
@@ -375,9 +398,9 @@ the connection or globally: see the function
 .. note::
-    If you want to receive uniformly all your database input in Unicode, you
+    In Python 2, if you want to receive uniformly all your database input in
-    can register the related typecasters globally as soon as Psycopg is
+    Unicode, you can register the related typecasters globally as soon as
-    imported::
+    Psycopg is imported::
        import psycopg2
        import psycopg2.extensions
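
The Python 3 behaviour documented in this change (SQL returned from `getquoted()` as `!bytes`, text encoded and decoded using the connection encoding) can be sketched without a database. This is only an illustration under stated assumptions: `QuotedText` is a hypothetical class that is not part of Psycopg, its quoting is deliberately simplified and must not be used to build real queries, and the encodings are passed by hand instead of being read from a connection.

```python
class QuotedText:
    """Hypothetical wrapper mimicking the shape of an ISQLQuote adapter."""

    def __init__(self, value, encoding="utf-8"):
        self.value = value
        self.encoding = encoding  # assumption: stands in for connection.encoding

    def getquoted(self):
        # Simplified quoting: only doubles embedded single quotes.
        escaped = self.value.replace("'", "''")
        # In Python 3 the SQL fragment is returned as bytes, not str.
        return ("'" + escaped + "'").encode(self.encoding)


quoted = QuotedText("O'Reilly").getquoted()

# A character missing from the target encoding cannot be encoded:
# the euro sign does not exist in Latin-1.
try:
    QuotedText("\u20ac", encoding="latin-1").getquoted()
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Bytes received from the backend are decoded with the client encoding
# to obtain text.
raw = b"\xe0\xe8\xec\xf2\xf9\xa4"
decoded = raw.decode("iso8859_15")
```

The `raw` value reuses the byte string shown in the Python 2 `repr` output in the diff above; decoding it with ISO 8859-15 (the codec matching PostgreSQL's LATIN9) yields a `!str` of six characters, the last one being the euro sign.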