Added cursor.itersize

The value is used to control the number of records to fetch per network
roundtrip in named cursors iteration. Used to avoid the inefficient
arraysize default of 1 without giving this value the magic meaning of
2000.
This commit is contained in:
Daniele Varrazzo 2011-02-17 12:29:07 +00:00
parent 7756eae573
commit 63ac6cdde5
6 changed files with 32 additions and 21 deletions

3
NEWS
View File

@ -9,7 +9,8 @@ What's new in psycopg 2.4
type.
- Added `register_composite()` function to cast PostgreSQL composite types
into Python tuples/namedtuples.
- More efficient iteration on named cursors.
- More efficient iteration on named cursors, fetching `itersize` records at
time from the backend.
- The build script refuses to guess values if pg_config is not found.
- Connections and cursors are weakly referenceable.
- Added 'b' and 't' mode to large objects: write can deal with both bytes

View File

@ -210,9 +210,9 @@ The ``cursor`` class
.. versionchanged:: 2.4
iterating over a :ref:`named cursor <server-side-cursors>`
fetches `~cursor.arraysize` records at time from the backend.
fetches `~cursor.itersize` records at time from the backend.
Previously only one record was fetched per roundtrip, resulting
in unefficient iteration.
in a large overhead.
.. method:: fetchone()
@ -306,18 +306,20 @@ The ``cursor`` class
time with `~cursor.fetchmany()`. It defaults to 1 meaning to fetch
a single row at a time.
The attribute is also used when iterating a :ref:`named cursor
<server-side-cursors>`: when syntax such as ``for i in cursor:`` is
used, in order to avoid an excessive number of network roundtrips, the
cursor will actually fetch `!arraysize` records at time from the
backend. For this task the default value of 1 is a poor value: if
`!arraysize` is 1, a default value of 2000 will be used instead. If
you really want to retrieve one record at time from the backend use
`fetchone()` in a loop.
.. versionchanged:: 2.4
`!arraysize` used in named cursor iteration.
.. attribute:: itersize
Read/write attribute specifying the number of rows to fetch from the
backend at each network roundtrip during :ref:`iteration
<cursor-iterable>` on a :ref:`named cursor <server-side-cursors>`. The
default is 2000.
.. versionadded:: 2.4
.. extension::
The `itersize` attribute is a Psycopg extension to the |DBAPI|.
.. attribute:: rowcount

View File

@ -529,6 +529,14 @@ allowing the user to move in the dataset using the `~cursor.scroll()`
method and to read the data using `~cursor.fetchone()` and
`~cursor.fetchmany()` methods.
Named cursors are also :ref:`iterable <cursor-iterable>` like regular cursors.
Notice however that before Psycopg 2.4 iteration was performed fetching one
record at time from the backend, resulting in a large overhead. The attribute
`~cursor.itersize` now controls how many records are now fetched at time
during the iteration: the default value of 2000 allows to fetch about 100KB
per roundtrip assuming records of 10-20 columns of mixed number and strings;
you may decrease this value if you are dealing with huge records.
.. |DECLARE| replace:: :sql:`DECLARE`
.. _DECLARE: http://www.postgresql.org/docs/9.0/static/sql-declare.html

View File

@ -45,6 +45,7 @@ typedef struct {
long int rowcount; /* number of rows affected by last execute */
long int columns; /* number of columns fetched from the db */
long int arraysize; /* how many rows should fetchmany() return */
long int itersize; /* how many rows should iter(cur) fetch in named cursors */
long int row; /* the row counter for fetch*() operations */
long int mark; /* transaction marker, copied from conn */

View File

@ -809,12 +809,8 @@ psyco_curs_next_named(cursorObject *self)
if (self->row >= self->rowcount) {
char buffer[128];
/* fetch 'arraysize' records, but shun the default value of 1 */
long int size = self->arraysize;
if (size == 1) { size = 2000L; }
PyOS_snprintf(buffer, 127, "FETCH FORWARD %ld FROM %s",
size, self->name);
self->itersize, self->name);
if (pq_execute(self, buffer, 0) == -1) return NULL;
if (_psyco_curs_prefetch(self) < 0) return NULL;
}
@ -1620,6 +1616,8 @@ static struct PyMemberDef cursorObject_members[] = {
{"arraysize", T_LONG, OFFSETOF(arraysize), 0,
"Number of records `fetchmany()` must fetch if not explicitly " \
"specified."},
{"itersize", T_LONG, OFFSETOF(itersize), 0,
"Number of records ``iter(cur)`` must fetch per network roundtrip."},
{"description", T_OBJECT, OFFSETOF(description), READONLY,
"Cursor description as defined in DBAPI-2.0."},
{"lastrowid", T_LONG, OFFSETOF(lastoid), READONLY,
@ -1682,6 +1680,7 @@ cursor_setup(cursorObject *self, connectionObject *conn, const char *name)
self->pgres = NULL;
self->notuples = 1;
self->arraysize = 1;
self->itersize = 2000;
self->rowcount = -1;
self->lastoid = InvalidOid;

View File

@ -145,7 +145,7 @@ class CursorTests(unittest.TestCase):
% (t2 - t1))
@skip_before_postgres(8, 0)
def test_iter_named_cursor_default_arraysize(self):
def test_iter_named_cursor_default_itersize(self):
curs = self.conn.cursor('tmp')
curs.execute('select generate_series(1,50)')
rv = [ (r[0], curs.rownumber) for r in curs ]
@ -153,9 +153,9 @@ class CursorTests(unittest.TestCase):
self.assertEqual(rv, [(i,i) for i in range(1,51)])
@skip_before_postgres(8, 0)
def test_iter_named_cursor_arraysize(self):
def test_iter_named_cursor_itersize(self):
curs = self.conn.cursor('tmp')
curs.arraysize = 30
curs.itersize = 30
curs.execute('select generate_series(1,50)')
rv = [ (r[0], curs.rownumber) for r in curs ]
# everything swallowed in two gulps