`psycopg2.extras` -- Miscellaneous goodies for Psycopg 2 ============================================================= .. sectionauthor:: Daniele Varrazzo .. module:: psycopg2.extras .. testsetup:: import psycopg2.extras from psycopg2.extras import Inet create_test_table() This module is a generic place used to hold little helper functions and classes until a better place in the distribution is found. .. _cursor-subclasses: Connection and cursor subclasses -------------------------------- A few objects that change the way the results are returned by the cursor or modify the object behavior in some other way. Typically `!cursor` subclasses are passed as *cursor_factory* argument to `~psycopg2.connect()` so that the connection's `~connection.cursor()` method will generate objects of this class. Alternatively a `!cursor` subclass can be used one-off by passing it as the *cursor_factory* argument to the `!cursor()` method. If you want to use a `!connection` subclass you can pass it as the *connection_factory* argument of the `!connect()` function. .. index:: pair: Cursor; Dictionary .. _dict-cursor: Dictionary-like cursor ^^^^^^^^^^^^^^^^^^^^^^ The dict cursors allow to access to the retrieved records using an interface similar to the Python dictionaries instead of the tuples. >>> dict_cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor) >>> dict_cur.execute("INSERT INTO test (num, data) VALUES(%s, %s)", ... (100, "abc'def")) >>> dict_cur.execute("SELECT * FROM test") >>> rec = dict_cur.fetchone() >>> rec['id'] 1 >>> rec['num'] 100 >>> rec['data'] "abc'def" The records still support indexing as the original tuple: >>> rec[2] "abc'def" .. autoclass:: DictCursor .. autoclass:: DictConnection .. note:: Not very useful since Psycopg 2.5: you can use `psycopg2.connect`\ ``(dsn, cursor_factory=DictCursor)`` instead of `!DictConnection`. .. autoclass:: DictRow Real dictionary cursor ^^^^^^^^^^^^^^^^^^^^^^ .. autoclass:: RealDictCursor .. autoclass:: RealDictConnection .. note:: Not very useful since Psycopg 2.5: you can use `psycopg2.connect`\ ``(dsn, cursor_factory=RealDictCursor)`` instead of `!RealDictConnection`. .. autoclass:: RealDictRow .. index:: pair: Cursor; namedtuple `namedtuple` cursor ^^^^^^^^^^^^^^^^^^^^ .. versionadded:: 2.3 These objects require :py:func:`collections.namedtuple` to be found, so it is available out-of-the-box only from Python 2.6. Anyway, the namedtuple implementation is compatible with previous Python versions, so all you have to do is to `download it`__ and make it available where we expect it to be... :: from somewhere import namedtuple import collections collections.namedtuple = namedtuple from psycopg.extras import NamedTupleConnection # ... .. __: http://code.activestate.com/recipes/500261-named-tuples/ .. autoclass:: NamedTupleCursor .. autoclass:: NamedTupleConnection .. note:: Not very useful since Psycopg 2.5: you can use `psycopg2.connect`\ ``(dsn, cursor_factory=NamedTupleCursor)`` instead of `!NamedTupleConnection`. .. index:: pair: Cursor; Logging Logging cursor ^^^^^^^^^^^^^^ .. autoclass:: LoggingConnection :members: initialize,filter .. autoclass:: LoggingCursor .. autoclass:: MinTimeLoggingConnection :members: initialize,filter .. autoclass:: MinTimeLoggingCursor Replication cursor ^^^^^^^^^^^^^^^^^^ .. autoclass:: LogicalReplicationConnection This connection factory class can be used to open a special type of connection that is used for logical replication. Example:: from psycopg2.extras import LogicalReplicationConnection log_conn = psycopg2.connect(dsn, connection_factory=LogicalReplicationConnection) log_cur = log_conn.cursor() .. autoclass:: PhysicalReplicationConnection This connection factory class can be used to open a special type of connection that is used for physical replication. Example:: from psycopg2.extras import PhysicalReplicationConnection phys_conn = psycopg2.connect(dsn, connection_factory=PhysicalReplicationConnection) phys_cur = phys_conn.cursor() Both `LogicalReplicationConnection` and `PhysicalReplicationConnection` use `ReplicationCursor` for actual communication on the connection. .. seealso:: - PostgreSQL `Streaming Replication Protocol`__ .. __: http://www.postgresql.org/docs/current/static/protocol-replication.html .. autoclass:: ReplicationCursor .. method:: identify_system() Execute ``IDENTIFY_SYSTEM`` command of the streaming replication protocol and return the result as a dictionary. Example:: >>> cur.identify_system() {'timeline': 1, 'systemid': '1234567890123456789', 'dbname': 'test', 'xlogpos': '0/1ABCDEF'} .. method:: create_replication_slot(slot_name, output_plugin=None) Create streaming replication slot. :param slot_name: name of the replication slot to be created :param slot_type: type of replication: should be either `REPLICATION_LOGICAL` or `REPLICATION_PHYSICAL` :param output_plugin: name of the logical decoding output plugin to be used by the slot; required for logical replication connections, disallowed for physical Example:: log_cur.create_replication_slot("logical1", "test_decoding") phys_cur.create_replication_slot("physical1") # either logical or physical replication connection cur.create_replication_slot("slot1", slot_type=REPLICATION_LOGICAL) When creating a slot on a logical replication connection, a logical replication slot is created by default. Logical replication requires name of the logical decoding output plugin to be specified. When creating a slot on a physical replication connection, a physical replication slot is created by default. No output plugin parameter is required or allowed when creating a physical replication slot. In either case, the type of slot being created can be specified explicitly using *slot_type* parameter. Replication slots are a feature of PostgreSQL server starting with version 9.4. .. method:: drop_replication_slot(slot_name) Drop streaming replication slot. :param slot_name: name of the replication slot to drop Example:: # either logical or physical replication connection cur.drop_replication_slot("slot1") This Replication slots are a feature of PostgreSQL server starting with version 9.4. .. method:: start_replication(slot_name=None, writer=None, slot_type=None, start_lsn=0, timeline=0, keepalive_interval=10, options=None) Start replication on the connection. :param slot_name: name of the replication slot to use; required for logical replication, physical replication can work with or without a slot :param writer: a file-like object to write replication messages to :param slot_type: type of replication: should be either `REPLICATION_LOGICAL` or `REPLICATION_PHYSICAL` :param start_lsn: the optional LSN position to start replicating from, can be an integer or a string of hexadecimal digits in the form ``XXX/XXX`` :param timeline: WAL history timeline to start streaming from (optional, can only be used with physical replication) :param keepalive_interval: interval (in seconds) to send keepalive messages to the server :param options: a dictionary of options to pass to logical replication slot (not allowed with physical replication, set to *None*) If not specified using *slot_type* parameter, the type of replication to be started is defined by the type of replication connection. Logical replication is only allowed on logical replication connection, but physical replication can be used with both types of connection. On the other hand, physical replication doesn't require a named replication slot to be used, only logical one does. In any case, logical replication and replication slots are a feature of PostgreSQL server starting with version 9.4. Physical replication can be used starting with 9.0. If a *slot_name* is specified, the slot must exist on the server and its type must match the replication type used. When used on non-asynchronous connection this method enters an endless loop, reading messages from the server and passing them to ``write()`` method of the *writer* object. This is similar to operation of the `~cursor.copy_to()` method. It also sends keepalive messages to the server, in case there were no new data from it for the duration of *keepalive_interval* seconds (this parameter's value must be equal to at least than 1 second, but it can have a fractional part). With asynchronous connection, this method returns immediately and the calling code can start reading the replication messages in a loop. A sketch implementation of the *writer* object for logical replication might look similar to the following:: from io import TextIOBase class LogicalStreamWriter(TextIOBase): def write(self, msg): self.store_message_data(msg.payload) if self.should_report_to_the_server_now(msg): msg.cursor.send_replication_feedback(flush_lsn=msg.wal_end) First, like with the `~cursor.copy_to()` method, the code that calls the provided ``write()`` method checks if the *writer* object is inherited from `~io.TextIOBase`. If that is the case, the message payload to be passed is converted to unicode using the connection's `~connection.encoding` information. Otherwise, the message is passed as is. The *msg* object being passed is an instance of `~ReplicationMessage` class. After storing certain amount of messages' data reliably, the client should send a confirmation message to the server. This should be done by calling `~send_replication_feedback()` method on the corresponding replication cursor. A reference to the cursor is provided in the `~ReplicationMessage` as an attribute. .. warning:: Failure to properly notify the server by constantly consuming and reporting success at appropriate times can eventually lead to "disk full" condition on the server, because the server retains all the WAL segments that might be needed to stream the changes via all of the currently open replication slots. On the other hand, it is not recommended to send a confirmation after every processed message, since that will put an unnecessary load on network and the server. A possible strategy is to confirm after every COMMIT message. .. method:: stop_replication() In non-asynchronous connection, when called from the ``write()`` method, tell the code in `~start_replication` to break out of the endless loop and return. .. method:: send_replication_feedback(write_lsn=0, flush_lsn=0, apply_lsn=0, reply=False) :param write_lsn: a LSN position up to which the client has written the data locally :param flush_lsn: a LSN position up to which the client has stored the data reliably (the server is allowed to discard all and every data that predates this LSN) :param apply_lsn: a LSN position up to which the warm standby server has applied the changes (physical replication master-slave protocol only) :param reply: request the server to send back a keepalive message immediately Use this method to report to the server that all messages up to a certain LSN position have been stored and may be discarded. This method can also be called with all default parameters' values to send a keepalive message to the server. In case of asynchronous connection, if the feedback message cannot be sent at the moment, remembers the passed LSN positions for a later hopefully successful call or call to `~flush_replication_feedback()`. .. method:: flush_replication_feedback(reply=False) :param reply: request the server to send back a keepalive message immediately This method tries to flush the latest replication feedback message that `~send_replication_feedback()` was trying to send, if any. Low-level methods for asynchronous connection operation. With the non-asynchronous connection, a single call to `~start_replication()` handles all the complexity, but at times it might be beneficial to use low-level interface for better control, in particular to `~select.select()` on multiple sockets. The following methods are provided for asynchronous operation: .. method:: read_replication_message(decode=True) :param decode: a flag indicating that unicode conversion should be performed on the data received from the server This method should be used in a loop with asynchronous connections after calling `~start_replication()` once. It tries to read the next message from the server, without blocking and returns an instance of `~ReplicationMessage` or *None*, in case there are no more data messages from the server at the moment. It is expected that the calling code will call this method repeatedly in order to consume all of the messages that might have been buffered, until *None* is returned. After receiving a *None* value from this method, one might use `~select.select()` or `~select.poll()` on the corresponding connection to block the process until there is more data from the server. The server can send keepalive messages to the client periodically. Such messages are silently consumed by this method and are never reported to the caller. .. method:: fileno() Call the corresponding connection's `~connection.fileno()` method and return the result. This is a convenience method which allows replication cursor to be used directly in `~select.select()` or `~select.poll()` calls. .. attribute:: replication_io_timestamp A `~datetime` object representing the timestamp at the moment of last communication with the server (a data or keepalive message in either direction). An actual example of asynchronous operation might look like this:: keepalive_interval = 10.0 while True: if (datetime.now() - cur.replication_io_timestamp).total_seconds() >= keepalive_interval: cur.send_replication_feedback() while True: msg = cur.read_replication_message() if not msg: break writer.write(msg) timeout = keepalive_interval - (datetime.now() - cur.replication_io_timestamp).total_seconds() if timeout > 0: select.select([cur], [], [], timeout) .. autoclass:: ReplicationMessage .. attribute:: payload The actual data received from the server. An instance of either ``str`` or ``unicode``. .. attribute:: data_start LSN position of the start of the message. .. attribute:: wal_end LSN position of the end of the message. .. attribute:: send_time A `~datetime` object representing the server timestamp at the moment when the message was sent. .. attribute:: cursor A reference to the corresponding `~ReplicationCursor` object. .. data:: REPLICATION_LOGICAL .. data:: REPLICATION_PHYSICAL .. index:: pair: Cursor; Replication .. index:: single: Data types; Additional Additional data types --------------------- .. index:: pair: JSON; Data types pair: JSON; Adaptation .. _adapt-json: JSON_ adaptation ^^^^^^^^^^^^^^^^ .. versionadded:: 2.5 .. versionchanged:: 2.5.4 added |jsonb| support. In previous versions |jsonb| values are returned as strings. See :ref:`the FAQ ` for a workaround. Psycopg can adapt Python objects to and from the PostgreSQL |pgjson|_ and |jsonb| types. With PostgreSQL 9.2 and following versions adaptation is available out-of-the-box. To use JSON data with previous database versions (either with the `9.1 json extension`__, but even if you want to convert text fields to JSON) you can use the `register_json()` function. .. __: http://people.planetpostgresql.org/andrew/index.php?/archives/255-JSON-for-PG-9.2-...-and-now-for-9.1!.html The Python library used by default to convert Python objects to JSON and to parse data from the database depends on the language version: with Python 2.6 and following the :py:mod:`json` module from the standard library is used; with previous versions the `simplejson`_ module is used if available. Note that the last `!simplejson` version supporting Python 2.4 is the 2.0.9. .. _JSON: http://www.json.org/ .. |pgjson| replace:: :sql:`json` .. |jsonb| replace:: :sql:`jsonb` .. _pgjson: http://www.postgresql.org/docs/current/static/datatype-json.html .. _simplejson: http://pypi.python.org/pypi/simplejson/ In order to pass a Python object to the database as query argument you can use the `Json` adapter:: curs.execute("insert into mytable (jsondata) values (%s)", [Json({'a': 100})]) Reading from the database, |pgjson| and |jsonb| values will be automatically converted to Python objects. .. note:: If you are using the PostgreSQL :sql:`json` data type but you want to read it as string in Python instead of having it parsed, your can either cast the column to :sql:`text` in the query (it is an efficient operation, that doesn't involve a copy):: cur.execute("select jsondata::text from mytable") or you can register a no-op `!loads()` function with `register_default_json()`:: psycopg2.extras.register_default_json(loads=lambda x: x) .. note:: You can use `~psycopg2.extensions.register_adapter()` to adapt any Python dictionary to JSON, either registering `Json` or any subclass or factory creating a compatible adapter:: psycopg2.extensions.register_adapter(dict, psycopg2.extras.Json) This setting is global though, so it is not compatible with similar adapters such as the one registered by `register_hstore()`. Any other object supported by JSON can be registered the same way, but this will clobber the default adaptation rule, so be careful to unwanted side effects. If you want to customize the adaptation from Python to PostgreSQL you can either provide a custom `!dumps()` function to `Json`:: curs.execute("insert into mytable (jsondata) values (%s)", [Json({'a': 100}, dumps=simplejson.dumps)]) or you can subclass it overriding the `~Json.dumps()` method:: class MyJson(Json): def dumps(self, obj): return simplejson.dumps(obj) curs.execute("insert into mytable (jsondata) values (%s)", [MyJson({'a': 100})]) Customizing the conversion from PostgreSQL to Python can be done passing a custom `!loads()` function to `register_json()`. For the builtin data types (|pgjson| from PostgreSQL 9.2, |jsonb| from PostgreSQL 9.4) use `register_default_json()` and `register_default_jsonb()`. For example, if you want to convert the float values from :sql:`json` into :py:class:`~decimal.Decimal` you can use:: loads = lambda x: json.loads(x, parse_float=Decimal) psycopg2.extras.register_json(conn, loads=loads) .. autoclass:: Json .. automethod:: dumps .. autofunction:: register_json .. versionchanged:: 2.5.4 added the *name* parameter to enable :sql:`jsonb` support. .. autofunction:: register_default_json .. autofunction:: register_default_jsonb .. versionadded:: 2.5.4 .. index:: pair: hstore; Data types pair: dict; Adaptation .. _adapt-hstore: Hstore data type ^^^^^^^^^^^^^^^^ .. versionadded:: 2.3 The |hstore|_ data type is a key-value store embedded in PostgreSQL. It has been available for several server versions but with the release 9.0 it has been greatly improved in capacity and usefulness with the addition of many functions. It supports GiST or GIN indexes allowing search by keys or key/value pairs as well as regular BTree indexes for equality, uniqueness etc. Psycopg can convert Python `!dict` objects to and from |hstore| structures. Only dictionaries with string/unicode keys and values are supported. `!None` is also allowed as value but not as a key. Psycopg uses a more efficient |hstore| representation when dealing with PostgreSQL 9.0 but previous server versions are supported as well. By default the adapter/typecaster are disabled: they can be enabled using the `register_hstore()` function. .. autofunction:: register_hstore .. versionchanged:: 2.4 added the *oid* parameter. If not specified, the typecaster is installed also if |hstore| is not installed in the :sql:`public` schema. .. versionchanged:: 2.4.3 added support for |hstore| array. .. |hstore| replace:: :sql:`hstore` .. _hstore: http://www.postgresql.org/docs/current/static/hstore.html .. index:: pair: Composite types; Data types pair: tuple; Adaptation pair: namedtuple; Adaptation .. _adapt-composite: Composite types casting ^^^^^^^^^^^^^^^^^^^^^^^ .. versionadded:: 2.4 Using `register_composite()` it is possible to cast a PostgreSQL composite type (either created with the |CREATE TYPE|_ command or implicitly defined after a table row type) into a Python named tuple, or into a regular tuple if :py:func:`collections.namedtuple` is not found. .. |CREATE TYPE| replace:: :sql:`CREATE TYPE` .. _CREATE TYPE: http://www.postgresql.org/docs/current/static/sql-createtype.html .. doctest:: >>> cur.execute("CREATE TYPE card AS (value int, suit text);") >>> psycopg2.extras.register_composite('card', cur) >>> cur.execute("select (8, 'hearts')::card") >>> cur.fetchone()[0] card(value=8, suit='hearts') Nested composite types are handled as expected, provided that the type of the composite components are registered as well. .. doctest:: >>> cur.execute("CREATE TYPE card_back AS (face card, back text);") >>> psycopg2.extras.register_composite('card_back', cur) >>> cur.execute("select ((8, 'hearts'), 'blue')::card_back") >>> cur.fetchone()[0] card_back(face=card(value=8, suit='hearts'), back='blue') Adaptation from Python tuples to composite types is automatic instead and requires no adapter registration. .. _custom-composite: .. Note:: If you want to convert PostgreSQL composite types into something different than a `!namedtuple` you can subclass the `CompositeCaster` overriding `~CompositeCaster.make()`. For example, if you want to convert your type into a Python dictionary you can use:: >>> class DictComposite(psycopg2.extras.CompositeCaster): ... def make(self, values): ... return dict(zip(self.attnames, values)) >>> psycopg2.extras.register_composite('card', cur, ... factory=DictComposite) >>> cur.execute("select (8, 'hearts')::card") >>> cur.fetchone()[0] {'suit': 'hearts', 'value': 8} .. autofunction:: register_composite .. versionchanged:: 2.4.3 added support for array of composite types .. versionchanged:: 2.5 added the *factory* parameter .. autoclass:: CompositeCaster .. automethod:: make .. versionadded:: 2.5 Object attributes: .. attribute:: name The name of the PostgreSQL type. .. attribute:: schema The schema where the type is defined. .. versionadded:: 2.5 .. attribute:: oid The oid of the PostgreSQL type. .. attribute:: array_oid The oid of the PostgreSQL array type, if available. .. attribute:: type The type of the Python objects returned. If :py:func:`collections.namedtuple()` is available, it is a named tuple with attributes equal to the type components. Otherwise it is just the `!tuple` object. .. attribute:: attnames List of component names of the type to be casted. .. attribute:: atttypes List of component type oids of the type to be casted. .. index:: pair: range; Data types .. _adapt-range: Range data types ^^^^^^^^^^^^^^^^ .. versionadded:: 2.5 Psycopg offers a `Range` Python type and supports adaptation between them and PostgreSQL |range|_ types. Builtin |range| types are supported out-of-the-box; user-defined |range| types can be adapted using `register_range()`. .. |range| replace:: :sql:`range` .. _range: http://www.postgresql.org/docs/current/static/rangetypes.html .. autoclass:: Range This Python type is only used to pass and retrieve range values to and from PostgreSQL and doesn't attempt to replicate the PostgreSQL range features: it doesn't perform normalization and doesn't implement all the operators__ supported by the database. .. __: http://www.postgresql.org/docs/current/static/functions-range.html#RANGE-OPERATORS-TABLE `!Range` objects are immutable, hashable, and support the ``in`` operator (checking if an element is within the range). They can be tested for equivalence. Empty ranges evaluate to `!False` in boolean context, nonempty evaluate to `!True`. .. versionchanged:: 2.5.3 `!Range` objects can be sorted although, as on the server-side, this ordering is not particularly meangingful. It is only meant to be used by programs assuming objects using `!Range` as primary key can be sorted on them. In previous versions comparing `!Range`\s raises `!TypeError`. Although it is possible to instantiate `!Range` objects, the class doesn't have an adapter registered, so you cannot normally pass these instances as query arguments. To use range objects as query arguments you can either use one of the provided subclasses, such as `NumericRange` or create a custom subclass using `register_range()`. Object attributes: .. autoattribute:: isempty .. autoattribute:: lower .. autoattribute:: upper .. autoattribute:: lower_inc .. autoattribute:: upper_inc .. autoattribute:: lower_inf .. autoattribute:: upper_inf The following `Range` subclasses map builtin PostgreSQL |range| types to Python objects: they have an adapter registered so their instances can be passed as query arguments. |range| values read from database queries are automatically casted into instances of these classes. .. autoclass:: NumericRange .. autoclass:: DateRange .. autoclass:: DateTimeRange .. autoclass:: DateTimeTZRange .. note:: Python lacks a representation for :sql:`infinity` date so Psycopg converts the value to `date.max` and such. When written into the database these dates will assume their literal value (e.g. :sql:`9999-12-31` instead of :sql:`infinity`). Check :ref:`infinite-dates-handling` for an example of an alternative adapter to map `date.max` to :sql:`infinity`. An alternative dates adapter will be used automatically by the `DateRange` adapter and so on. Custom |range| types (created with |CREATE TYPE|_ :sql:`... AS RANGE`) can be adapted to a custom `Range` subclass: .. autofunction:: register_range .. autoclass:: RangeCaster Object attributes: .. attribute:: range The `!Range` subclass adapted. .. attribute:: adapter The `~psycopg2.extensions.ISQLQuote` responsible to adapt `!range`. .. attribute:: typecaster The object responsible for casting. .. attribute:: array_typecaster The object responsible to cast arrays, if available, else `!None`. .. index:: pair: UUID; Data types .. _adapt-uuid: UUID data type ^^^^^^^^^^^^^^ .. versionadded:: 2.0.9 .. versionchanged:: 2.0.13 added UUID array support. .. doctest:: >>> psycopg2.extras.register_uuid() >>> # Python UUID can be used in SQL queries >>> import uuid >>> my_uuid = uuid.UUID('{12345678-1234-5678-1234-567812345678}') >>> psycopg2.extensions.adapt(my_uuid).getquoted() "'12345678-1234-5678-1234-567812345678'::uuid" >>> # PostgreSQL UUID are transformed into Python UUID objects. >>> cur.execute("SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid") >>> cur.fetchone()[0] UUID('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11') .. autofunction:: register_uuid .. autoclass:: UUID_adapter .. index:: pair: INET; Data types :sql:`inet` data type ^^^^^^^^^^^^^^^^^^^^^^ .. versionadded:: 2.0.9 .. versionchanged:: 2.4.5 added inet array support. .. doctest:: >>> psycopg2.extras.register_inet() >>> cur.mogrify("SELECT %s", (Inet('127.0.0.1/32'),)) "SELECT E'127.0.0.1/32'::inet" >>> cur.execute("SELECT '192.168.0.1/24'::inet") >>> cur.fetchone()[0].addr '192.168.0.1/24' .. autofunction:: register_inet .. autoclass:: Inet .. index:: single: Time zones; Fractional Fractional time zones --------------------- .. autofunction:: register_tstz_w_secs .. versionadded:: 2.0.9 .. versionchanged:: 2.2.2 function is no-op: see :ref:`tz-handling`. .. index:: pair: Example; Coroutine; Coroutine support ----------------- .. autofunction:: wait_select(conn)