Added cursor.itersize

The value is used to control the number of records to fetch per network roundtrip in named cursors iteration. Used to avoid the inefficient arraysize default of 1 without giving this value the magic meaning of 2000.
2025-07-15 02:32:24 +03:00 · 2011-02-17 12:29:07 +00:00 · 2011-02-17 12:29:07 +00:00 · 63ac6cdde5
commit 63ac6cdde5
parent 7756eae573
6 changed files with 32 additions and 21 deletions
--- a/3
+++ b/3
@ -9,7 +9,8 @@ What's new in psycopg 2.4
    type.
  - Added `register_composite()` function to cast PostgreSQL composite types
    into Python tuples/namedtuples.
-  - More efficient iteration on named cursors.
+  - More efficient iteration on named cursors, fetching `itersize` records at
+    time from the backend.
  - The build script refuses to guess values if pg_config is not found.
  - Connections and cursors are weakly referenceable.
  - Added 'b' and 't' mode to large objects: write can deal with both bytes
--- a/doc/src/cursor.rst
+++ b/doc/src/cursor.rst
@ -210,9 +210,9 @@ The ``cursor`` class

        .. versionchanged:: 2.4
            iterating over a :ref:`named cursor <server-side-cursors>`
-            fetches `~cursor.arraysize` records at time from the backend.
+            fetches `~cursor.itersize` records at time from the backend.
            Previously only one record was fetched per roundtrip, resulting
-             in unefficient iteration.
+            in a large overhead.

    .. method:: fetchone()

@ -306,18 +306,20 @@ The ``cursor`` class
        time with `~cursor.fetchmany()`. It defaults to 1 meaning to fetch
        a single row at a time.

-        The attribute is also used when iterating a :ref:`named cursor
-        <server-side-cursors>`: when syntax such as ``for i in cursor:`` is
-        used, in order to avoid an excessive number of network roundtrips, the
-        cursor will actually fetch `!arraysize` records at time from the
-        backend. For this task the default value of 1 is a poor value: if
-        `!arraysize` is 1, a default value of 2000 will be used instead. If
-        you really want to retrieve one record at time from the backend use
-        `fetchone()` in a loop.

-        .. versionchanged:: 2.4
-            `!arraysize` used in named cursor iteration.
+    .. attribute:: itersize
+
+        Read/write attribute specifying the number of rows to fetch from the
+        backend at each network roundtrip during :ref:`iteration
+        <cursor-iterable>` on a :ref:`named cursor <server-side-cursors>`. The
+        default is 2000.
+
+        .. versionadded:: 2.4
        
+        .. extension::
+
+            The `itersize` attribute is a Psycopg extension to the |DBAPI|.
+

    .. attribute:: rowcount 
          
--- a/doc/src/usage.rst
+++ b/doc/src/usage.rst
@ -529,6 +529,14 @@ allowing the user to move in the dataset using the `~cursor.scroll()`
 method and to read the data using `~cursor.fetchone()` and
 `~cursor.fetchmany()` methods.

+Named cursors are also :ref:`iterable <cursor-iterable>` like regular cursors.
+Notice however that before Psycopg 2.4 iteration was performed fetching one
+record at time from the backend, resulting in a large overhead. The attribute
+`~cursor.itersize` now controls how many records are now fetched at time
+during the iteration: the default value of 2000 allows to fetch about 100KB
+per roundtrip assuming records of 10-20 columns of mixed number and strings;
+you may decrease this value if you are dealing with huge records.
+
 .. |DECLARE| replace:: :sql:`DECLARE`
 .. _DECLARE: http://www.postgresql.org/docs/9.0/static/sql-declare.html

--- a/psycopg/cursor.h
+++ b/psycopg/cursor.h
@ -45,6 +45,7 @@ typedef struct {
    long int rowcount;       /* number of rows affected by last execute */
    long int columns;        /* number of columns fetched from the db */
    long int arraysize;      /* how many rows should fetchmany() return */
+    long int itersize;       /* how many rows should iter(cur) fetch in named cursors */
    long int row;            /* the row counter for fetch*() operations */
    long int mark;           /* transaction marker, copied from conn */

--- a/psycopg/cursor_type.c
+++ b/psycopg/cursor_type.c
@ -809,12 +809,8 @@ psyco_curs_next_named(cursorObject *self)
    if (self->row >= self->rowcount) {
        char buffer[128];

-        /* fetch 'arraysize' records, but shun the default value of 1 */
-        long int size = self->arraysize;
-        if (size == 1) { size = 2000L; }
-
        PyOS_snprintf(buffer, 127, "FETCH FORWARD %ld FROM %s",
-            size, self->name);
+            self->itersize, self->name);
        if (pq_execute(self, buffer, 0) == -1) return NULL;
        if (_psyco_curs_prefetch(self) < 0) return NULL;
    }
@ -1620,6 +1616,8 @@ static struct PyMemberDef cursorObject_members[] = {
    {"arraysize", T_LONG, OFFSETOF(arraysize), 0,
        "Number of records `fetchmany()` must fetch if not explicitly " \
        "specified."},
+    {"itersize", T_LONG, OFFSETOF(itersize), 0,
+        "Number of records ``iter(cur)`` must fetch per network roundtrip."},
    {"description", T_OBJECT, OFFSETOF(description), READONLY,
        "Cursor description as defined in DBAPI-2.0."},
    {"lastrowid", T_LONG, OFFSETOF(lastoid), READONLY,
@ -1682,6 +1680,7 @@ cursor_setup(cursorObject *self, connectionObject *conn, const char *name)
    self->pgres = NULL;
    self->notuples = 1;
    self->arraysize = 1;
+    self->itersize = 2000;
    self->rowcount = -1;
    self->lastoid = InvalidOid;

--- a/tests/test_cursor.py
+++ b/tests/test_cursor.py
@ -145,7 +145,7 @@ class CursorTests(unittest.TestCase):
            % (t2 - t1))

    @skip_before_postgres(8, 0)
-    def test_iter_named_cursor_default_arraysize(self):
+    def test_iter_named_cursor_default_itersize(self):
        curs = self.conn.cursor('tmp')
        curs.execute('select generate_series(1,50)')
        rv = [ (r[0], curs.rownumber) for r in curs ]
@ -153,9 +153,9 @@ class CursorTests(unittest.TestCase):
        self.assertEqual(rv, [(i,i) for i in range(1,51)])

    @skip_before_postgres(8, 0)
-    def test_iter_named_cursor_arraysize(self):
+    def test_iter_named_cursor_itersize(self):
        curs = self.conn.cursor('tmp')
-        curs.arraysize = 30
+        curs.itersize = 30
        curs.execute('select generate_series(1,50)')
        rv = [ (r[0], curs.rownumber) for r in curs ]
        # everything swallowed in two gulps