Added documentation about Unicode handling.

This commit is contained in:
Daniele Varrazzo 2010-02-11 05:17:10 +00:00 committed by Federico Di Gregorio
parent 9a6c4c4c93
commit 0694b85e59
2 changed files with 120 additions and 12 deletions

View File

@ -236,6 +236,7 @@ The module exports a few exceptions in addition to the :ref:`standard ones
It is a subclass of :exc:`~psycopg2.OperationalError`
.. index::
pair: Isolation level; Constants
@ -360,3 +361,36 @@ can be read from the :data:`~connection.status` attribute.
Used internally.
Additional database types
-------------------------
The :mod:`!extensions` module includes typecasters for many standard
PostgreSQL types. These objects allow the conversion of returned data into
Python objects. All the typecasters are automatically registered, except
:data:`UNICODE` and :data:`UNICODEARRAY`: you can register them using
:func:`register_type` in order to receive Unicode objects instead of strings
from the database. See :ref:`unicode-handling` for details.
.. data:: BINARYARRAY
.. data:: BOOLEAN
.. data:: BOOLEANARRAY
.. data:: DATE
.. data:: DATEARRAY
.. data:: DATETIMEARRAY
.. data:: DECIMALARRAY
.. data:: FLOAT
.. data:: FLOATARRAY
.. data:: INTEGER
.. data:: INTEGERARRAY
.. data:: INTERVAL
.. data:: INTERVALARRAY
.. data:: LONGINTEGER
.. data:: LONGINTEGERARRAY
.. data:: ROWIDARRAY
.. data:: STRINGARRAY
.. data:: TIME
.. data:: TIMEARRAY
.. data:: UNICODE
.. data:: UNICODEARRAY

View File

@ -247,8 +247,8 @@ the SQL string that would be sent to the database.
single: Float; Adaptation
single: Decimal; Adaptation
- Numeric objects: ``int``, ``long``, ``float``, ``Decimal`` are converted in
the PostgreSQL numerical representation::
- Numeric objects: :class:`!int`, :class:`!long`, :class:`!float`,
:class:`!Decimal` are converted in the PostgreSQL numerical representation::
>>> cur.mogrify("SELECT %s, %s, %s, %s;", (10, 10L, 10.0, Decimal("10.00")))
>>> 'SELECT 10, 10, 10.0, 10.00;'
@ -260,11 +260,11 @@ the SQL string that would be sent to the database.
single: bytea; Adaptation
single: Binary string
- String types: ``str``, ``unicode`` are converted in SQL string syntax.
``buffer`` is converted in PostgreSQL binary string syntax, suitable for
:sql:`bytea` fields.
.. todo:: unicode not working?
- String types: :class:`!str`, :class:`!unicode` are converted in SQL string
syntax. :class:`!buffer` is converted in PostgreSQL binary string syntax,
suitable for :sql:`bytea` fields. When reading textual fields, either
:class:`!str` or :class:`!unicode` can be received: see
:ref:`unicode-handling`.
.. index::
single: Date objects; Adaptation
@ -272,10 +272,11 @@ the SQL string that would be sent to the database.
single: Interval objects; Adaptation
single: mx.DateTime; Adaptation
- Date and time objects: builtin ``datetime``, ``date``, ``time``.
``timedelta`` are converted into PostgreSQL's :sql:`timestamp`, :sql:`date`,
:sql:`time`, :sql:`interval` data types. Time zones are supported too. The
Egenix `mx.DateTime`_ objects are adapted the same way::
- Date and time objects: builtin :class:`!datetime`, :class:`!date`,
:class:`!time`. :class:`!timedelta` are converted into PostgreSQL's
:sql:`timestamp`, :sql:`date`, :sql:`time`, :sql:`interval` data types.
Time zones are supported too. The Egenix `mx.DateTime`_ objects are adapted
the same way::
>>> dt = datetime.datetime.now()
>>> dt
@ -320,6 +321,79 @@ the SQL string that would be sent to the database.
.. versionadded:: 2.0.6
the tuple :sql:`IN` adaptation.
.. index::
single: Unicode
.. _unicode-handling:
Unicode handling
^^^^^^^^^^^^^^^^
Psycopg can exchange Unicode data with a PostgreSQL database. Python
:class:`!unicode` objects are automatically *encoded* in the client encoding
defined on the database connection (the `PostgreSQL encoding`__, available in
:attr:`connection.encoding`, is translated into a `Python codec`__ using an
:data:`~psycopg2.extensions.encodings` mapping)::
>>> print u, type(u)
àèìòù€ <type 'unicode'>
>>> cur.execute("INSERT INTO test (num, data) VALUES (%s,%s);", (74, u))
.. __: http://www.postgresql.org/docs/8.4/static/multibyte.html
.. __: http://docs.python.org/library/codecs.html#standard-encodings
When reading data from the database, the strings returned are usually 8 bit
:class:`!str` objects encoded in the database client encoding::
>>> print conn.encoding
UTF8
>>> cur.execute("SELECT data FROM test WHERE num = 74")
>>> x = cur.fetchone()[0]
>>> print x, type(x), repr(x)
àèìòù€ <type 'str'> '\xc3\xa0\xc3\xa8\xc3\xac\xc3\xb2\xc3\xb9\xe2\x82\xac'
>>> conn.set_client_encoding('LATIN9')
>>> cur.execute("SELECT data FROM test WHERE num = 74")
>>> x = cur.fetchone()[0]
>>> print type(x), repr(x)
<type 'str'> '\xe0\xe8\xec\xf2\xf9\xa4'
In order to obtain :class:`!unicode` objects instead, it is possible to
register a typecaster so that PostgreSQL textual types are automatically
*decoded* using the current client encoding::
>>> psycopg2.extensions.register_type(psycopg2.extensions.UNICODE, cur)
>>> cur.execute("SELECT data FROM test WHERE num = 74")
>>> x = cur.fetchone()[0]
>>> print x, type(x), repr(x)
àèìòù€ <type 'unicode'> u'\xe0\xe8\xec\xf2\xf9\u20ac'
In the above example, the :data:`~psycopg2.extensions.UNICODE` typecaster is
registered only on the cursor. It is also possible to register typecasters on
the connection or globally: see the function
:func:`~psycopg2.extensions.register_type` and
:ref:`type-casting-from-sql-to-python` for details.
.. note::
If you want to receive uniformly all your database input in Unicode, you
can register the related typecasters globally as soon as Psycopg is
imported::
import psycopg2
import psycopg2.extensions
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)
and then forget about this story.
.. index::
pair: Server side; Cursor
pair: Named; Cursor
@ -345,7 +419,7 @@ large dataset can be examined without keeping it entirely in memory.
Server side cursor are created in PostgreSQL using the |DECLARE|_ command and
subsequently handled using :sql:`MOVE`, :sql:`FETCH` and :sql:`CLOSE` commands.
Psycopg wraps the database server side cursor in *named cursors*. A name
Psycopg wraps the database server side cursor in *named cursors*. A named
cursor is created using the :meth:`~connection.cursor` method specifying the
:obj:`!name` parameter. Such cursor will behave mostly like a regular cursor,
allowing the user to move in the dataset using the :meth:`~cursor.scroll`