Some extra bonus refactoring to improve the function readability (don't
reuse names for variables with different refcount rules, don't pass
separate obj/self, async pass-through...)
From the DB-API (https://www.python.org/dev/peps/pep-0249/):
OperationalError
Exception raised for errors that are related to the database's
operation and not necessarily under the control of the programmer,
e.g. an unexpected disconnect occurs, [...]
Additionally, psycopg2 was inconsistent, at least in the async case:
depending on how the "connection closed" error was reported from the
kernel to libpq, it would sometimes raise OperationalError and
sometimes DatabaseError. Now it always raises OperationalError.
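A minimal sketch of why the inconsistency matters to callers. The
classes below are stand-ins that mirror the PEP 249 hierarchy, not
psycopg2's own exceptions, and classify() is a hypothetical helper:
because OperationalError derives from DatabaseError, a handler written
for OperationalError does not catch a bare DatabaseError.

```python
# Stand-in exception classes mirroring the PEP 249 hierarchy (assumption:
# these are illustrative only, not imported from psycopg2).
class Error(Exception):
    pass


class DatabaseError(Error):
    pass


class OperationalError(DatabaseError):
    pass


def classify(exc):
    """Return how a caller that retries only on OperationalError reacts."""
    try:
        raise exc
    except OperationalError:
        return "retry"
    except DatabaseError:
        return "give up"


print(classify(OperationalError("server closed the connection")))
print(classify(DatabaseError("server closed the connection")))
```

The same disconnect would be retried in one case and treated as fatal
in the other, which is what always raising OperationalError avoids.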
There's a race condition that only seems to happen over Unix-domain
sockets. Sometimes, the closed socket is reported by the kernel to
libpq like this (captured with strace):
sendto(3, "Q\0\0\0\34select pg_backend_pid()\0", 29, MSG_NOSIGNAL, NULL, 0) = 29
recvfrom(3, "E\0\0\0mSFATAL\0C57P01\0Mterminating "..., 16384, 0, NULL, NULL) = 110
recvfrom(3, 0x12d0330, 16384, 0, 0, 0) = -1 ECONNRESET (Connection reset by peer)
That is, psycopg2/libpq sees no error when sending the first query
after the connection is closed, but gets an error reading the result.
In that case, everything worked fine.
But sometimes, the error manifests like this:
sendto(3, "Q\0\0\0\34select pg_backend_pid()\0", 29, MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe)
recvfrom(3, "E\0\0\0mSFATAL\0C57P01\0Mterminating "..., 16384, 0, NULL, NULL) = 110
recvfrom(3, "", 16274, 0, NULL, NULL) = 0
recvfrom(3, "", 16274, 0, NULL, NULL) = 0
i.e. libpq received an error when sending the query. This manifests as
a slightly different exception from a slightly different place. More
importantly, in this case connection.closed is left at 0 rather than
being set to 2, and that is the bug I'm fixing here.
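A rough sketch of the consequence for calling code, under the
assumption that a caller (for example a pool) decides whether to discard
a connection by looking at its closed attribute. FakeConnection,
should_discard() and probe() are hypothetical names used only for
illustration.

```python
class OperationalError(Exception):
    """Stand-in for the DB-API OperationalError."""


class FakeConnection:
    """Hypothetical connection that models only the `closed` attribute."""

    def __init__(self, mark_closed_on_error):
        self.closed = 0
        self._mark_closed_on_error = mark_closed_on_error

    def execute(self, query):
        # Simulate a backend that was terminated before the query was sent.
        if self._mark_closed_on_error:
            self.closed = 2  # the value expected for a broken connection
        raise OperationalError("terminating connection")


def should_discard(conn):
    """Treat any nonzero `closed` value as 'do not reuse this connection'."""
    return conn.closed != 0


def probe(conn):
    """Run a query, swallow the disconnect, report whether to discard."""
    try:
        conn.execute("select pg_backend_pid()")
    except OperationalError:
        pass
    return should_discard(conn)


print(probe(FakeConnection(mark_closed_on_error=True)))
print(probe(FakeConnection(mark_closed_on_error=False)))
```

In the second case the caller keeps a dead connection around, which is
the symptom described above.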
Note that we see almost identical behaviour for sync and async
connections, and the fixes are the same. So I added extremely similar
test cases.
Finally, there is still a bug here: for async connections, we
sometimes raise DatabaseError (incorrect) and sometimes raise
OperationalError (correct). Will fix that next.
(almost... except for micros rounding)
While this is probably an improvement on the previous implementation,
I am largely waving a dead chicken at windows, which keeps failing to
pass the seconds overflow test. If it doesn't pass now either I'll start
blaming Python's timedelta.
When moving from autocommit True -> False, reset only the server
parameters that were actually specified by psycopg, to honour the
session characteristics.
Added function 'timeradd'.
Changed second parameter of 'gettimeofday' to void, since it is not used
in the function and the MSVC timezone definition is not a struct.
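For readers unfamiliar with it, the semantics of a timeradd-style
helper can be sketched in Python as below. This is an illustration of
the arithmetic only, assuming both inputs are normalized (seconds,
microseconds) pairs; it is not the C code added by the commit.

```python
def timeradd(a, b):
    """Add two (seconds, microseconds) pairs, carrying into seconds.

    Assumes each input has 0 <= microseconds < 1_000_000.
    """
    sec = a[0] + b[0]
    usec = a[1] + b[1]
    if usec >= 1_000_000:
        # At most one carry is possible when both inputs are normalized.
        sec += 1
        usec -= 1_000_000
    return sec, usec


print(timeradd((1, 600_000), (2, 500_000)))  # (4, 100000)
```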
Store the state in the connection object and set the params on BEGIN
Some tests fail: a few can be fixed reading transaction_* instead of
default_transaction_*; but the behaviour of tx characteristics with
autocommit is effectively changed. It may be addressed by setting
default_transaction_* if autocommit is set.
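A minimal sketch of the idea of setting the characteristics on BEGIN.
The function name and its parameters are hypothetical, chosen for
illustration; None stands for "not specified, use the server default".

```python
def build_begin(isolation_level=None, readonly=None, deferrable=None):
    """Compose a BEGIN statement from the state stored on the connection.

    Only the characteristics that were explicitly set are emitted, so
    anything left as None falls back to the server-side defaults.
    """
    parts = ["BEGIN"]
    if isolation_level is not None:
        parts.append("ISOLATION LEVEL " + isolation_level)
    if readonly is not None:
        parts.append("READ ONLY" if readonly else "READ WRITE")
    if deferrable is not None:
        parts.append("DEFERRABLE" if deferrable else "NOT DEFERRABLE")
    return " ".join(parts)


print(build_begin())
print(build_begin("SERIALIZABLE", readonly=True, deferrable=True))
```

Since nothing is sent outside a transaction with this approach, an
autocommit connection is not affected by the stored state, which is the
behaviour change noted above.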
Store the encode/decode functions for the right codec in the connection.
The Python encoding name has been dropped from the connection to avoid the
temptation to use it...
The code paths that read the encoding on connection and that store the
new encoding in the structure after changing it in the backend have been
unified into a single function.
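The approach can be sketched in pure Python with the standard codecs
module. FakeConnection and set_codec() are hypothetical names; the real
change lives in the C connection structure.

```python
import codecs


class FakeConnection:
    """Hypothetical stand-in that caches codec functions, not the name."""

    def set_codec(self, py_codec_name):
        # Look the codec up once; later calls use the cached callables
        # instead of resolving the encoding name on every conversion.
        self.encode = codecs.getencoder(py_codec_name)
        self.decode = codecs.getdecoder(py_codec_name)


conn = FakeConnection()
conn.set_codec("utf-8")

# Both callables return a (result, length_consumed) tuple.
data, _ = conn.encode("\xe8")
text, _ = conn.decode(b"\xc3\xa8")
print(data, text)
```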
A replication connection - marked by the use of the keyword "replication"
in the DSN - does not support SET commands. Trying to send "SET datestyle"
will result in an exception.
PGRES_COPY_BOTH was introduced in 9.1: we can ifdef the hell out of
pgpath, but we may as well bury the dead horses instead of beating
them.
They smell funny, too.
Stops warning (caused by command line definition of PG_VERSION, so it
could have been avoided otherwise), but the file comment says:
Note that the definitions here are not intended to be exposed to clients
of the frontend interface libraries --- so we don't worry much about
polluting the namespace with lots of stuff...
so gulping it in doesn't seem like a good idea.
Would help using adapt(unicode) to quote strings without a connection,
see ticket #331.
Currently in heisenbug state: if test_connection_wins_anyway and
test_encoding_default run (in this order), the latter fails because the
returned value is "'\xe8 '", with an extra space. Skipping the first
test, the second succeeds.
The bad value is returned by the libpq:
ql = PQescapeString(to+eq+1, from, len);
just returns len = 2 and an extra space in the string... meh.
The type 'long' in Windows Visual C is 32 bits in size on both 32-bit
and 64-bit platforms. Changed the type of variables that could be > 2GB
from long to Py_ssize_t.
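The size difference can be checked from Python with ctypes. This is
only an inspection snippet; the printed values depend on the platform
and compiler, so no expected output is given.

```python
import ctypes

# Width of C 'long' versus the pointer-sized signed type on this build.
long_bits = ctypes.sizeof(ctypes.c_long) * 8
ssize_bits = ctypes.sizeof(ctypes.c_ssize_t) * 8
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8

print("long:", long_bits, "ssize_t:", ssize_bits, "pointer:", pointer_bits)

# A value above 2GB fits in a signed type only if it is wider than 32 bits.
two_gb_plus_one = 2**31 + 1
print("fits in long:", two_gb_plus_one < 2 ** (long_bits - 1))
print("fits in ssize_t:", two_gb_plus_one < 2 ** (ssize_bits - 1))
```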
Suggested by Craig Ringer in pull request #353. It should also give more
information in other cases reported to us on flaky servers (AWS,
digital ocean...), see bug #281.
The libpq's PQconsumeInput() returns 0 in case of an error only, but
we need to know if it was able to actually read something. Work
around this by setting an internal flag before retry.
This change exposes lower level functions for operating the
(logical) replication protocol, while keeping the high-level
start_replication function that does the whole job for you in
case of a synchronous connection.
A number of other changes and fixes are put into this commit.
Move libpq-specific code for streaming replication support into a
separate file. Also provide gettimeofday() on Win32, implementation
copied from Postgres core.
Introduce ReplicationConnection and ReplicationCursor classes, which
encapsulate the initiation of a special type of PostgreSQL connection and
the handling of special replication commands only available in this
connection mode.
The handling of the stream of replication data from the server is modelled
largely after the existing support for "COPY table TO file" command and
pg_recvlogical tool supplied with PostgreSQL (though, it can also be
used for physical replication.)
Calls PQconninfoParse to parse the dsn into a list of keyword and value
structs, then constructs a dictionary from that. Can be useful when one
needs to alter some part of the connection string reliably, but
doesn't want to get into all the details of parsing a dsn string:
quoting, URL format, etc.
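To illustrate the use case, here is a sketch of what one might do with
the resulting dictionary: change one entry and serialize it back to a
keyword/value string. make_dsn() and quote_dsn_value() are hypothetical
helpers written for this example, and the params dict is typed by hand
rather than produced by the new function. The quoting follows the libpq
rule that values which are empty or contain spaces are wrapped in single
quotes, with backslashes and single quotes escaped by a backslash.

```python
def quote_dsn_value(value):
    """Quote one value for a libpq keyword/value connection string."""
    value = str(value)
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    needs_quotes = (
        value == ""
        or any(ch.isspace() for ch in value)
        or escaped != value
    )
    return "'" + escaped + "'" if needs_quotes else value


def make_dsn(params):
    """Join a dict of connection parameters into a keyword/value string."""
    return " ".join(f"{k}={quote_dsn_value(v)}" for k, v in params.items())


# Assumed shape of a parsed DSN: a plain dict of keyword -> value strings.
params = {"dbname": "test", "host": "localhost"}
params["password"] = "it's a secret"
print(make_dsn(params))
```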
Multithreaded programs using libcrypto (part of OpenSSL) need to set up
callbacks to ensure safe execution. Both Python and libpq set up those
callbacks, which might lead to a conflict.
To avoid leaving dangling function pointers when being unloaded, libpq sets up
and removes the callbacks every time an SSL connection is opened and closed. If
another Python thread is performing unrelated SSL operations (like connecting
to a HTTPS server), this might lead to deadlocks, as described in
http://www.postgresql.org/message-id/871tlzrlkq.fsf@wulczer.org
Even if the problem gets fixed in libpq, it's still useful to have it
fixed in psycopg2. The solution is to use Python's own libcrypto callbacks and
completely disable handling them in libpq.
This is for people using dtuple.py; a dtuple.DatabaseTuple instance
keeps a reference to cursor.description, which is not picklable because
psycopg2 doesn't export the Column namedtuple it uses.
This commit exports the Column namedtuple, and includes a test to verify
the pickle/unpickle works after exporting Column.
If psycopg supports lo64 but the server doesn't, the user may pass values
that would overflow the api range, resulting in:
lo.seek((2<<30))
*** OperationalError: ERROR: invalid seek offset: -2147483648
Also improved the error messages and guard against INT_MIN for negative
seek offsets.
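The wrap-around in the error above can be reproduced with a 32-bit
signed integer. The check_seek_offset() guard below is a hypothetical
Python rendering of the kind of range check described, not the C code
from this commit; have_lo64 is an assumed flag meaning "both client and
server support 64-bit large object offsets".

```python
import ctypes

INT_MAX = 2**31 - 1
INT_MIN = -(2**31)

# 2 << 30 is 2**31, one past INT_MAX: truncated to 32 bits it wraps around.
offset = 2 << 30
wrapped = ctypes.c_int32(offset).value
print(offset, "->", wrapped)  # 2147483648 -> -2147483648


def check_seek_offset(offset, have_lo64):
    """Reject offsets the 32-bit API cannot represent (illustrative guard)."""
    if not have_lo64 and not (INT_MIN <= offset <= INT_MAX):
        raise OverflowError("seek offset out of range for 32-bit API: %d" % offset)
    return offset
```

Raising on the client side gives a clearer message than letting the
server report a negative offset the user never passed.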
`close()` is implicitly called by `__exit__()`, so an exit on error
would run a query on a connection in error state, causing another
exception that hides the original one. The fix is in `close()`, not in
`__exit__()`, because the semantics of the latter are simply to call the
former.
Closes #262.
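A small sketch of the pattern, with a hypothetical FakeResource class
standing in for the real object: `close()` skips the server round trip
when the connection is already in error, so the exception raised inside
the `with` block is the one the caller sees.

```python
class FakeResource:
    """Hypothetical object whose close() normally runs a cleanup query."""

    def __init__(self):
        self.connection_in_error = False
        self.closed = False
        self.queries = []

    def execute(self, query):
        if self.connection_in_error:
            raise RuntimeError("connection is in error state")
        self.queries.append(query)

    def close(self):
        if self.closed:
            return
        # The fix: don't talk to the server over a broken connection.
        if not self.connection_in_error:
            self.execute("cleanup query")
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # __exit__ only delegates; returning None lets the error propagate.
        self.close()


def demo():
    try:
        with FakeResource() as res:
            res.connection_in_error = True
            raise ValueError("original failure")
    except Exception as exc:
        return type(exc).__name__


print(demo())  # ValueError
```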
Deallocating closed large objects failed to decrement the connection
refcount. The fact the lobject is closed doesn't matter for refcount.
Issue detected by the always useful scripts/refcounter.py
With an extra bit of unrequested whitespace love.
This makes it possible to import _psycopg directly, after adding the
package directory to the pythonpath. This enables hacks such as:
import sys
sys.path.insert(0, '/path/to/psycopg2')
import _psycopg
sys.modules['psycopg2._psycopg'] = _psycopg
sys.path.pop(0)
which can work around e.g. the problem of #201, freeze that cannot
freeze psycopg2. Well, freeze cannot freeze it because it's just not
designed to deal with C extensions. At least now the frozen application
can hack the pythonpath and work around the limitation by importing
_psycopg as above and then doing the rest of the imports normally.
Keeping long-lived references to python objects is bad anyway: the
tz module couldn't be reloaded before.
Introduced in 2.0 beta 8, 2006 A.D. Went absolutely untouched in 8 years
of refactoring, when Python 2.5 and PostgreSQL 8.1 roamed the earth.
I would say it has stood the test of time.
Building without extensions has been long broken and nobody really cares
about a pure-DBAPI implementation (which could be created using a wrapper
instead).
Also, don't start an implicit transaction when fetching from a
named WITH HOLD cursor, since it already returns results
from a previously committed transaction.
The default repr is enough: it prints <TypeName at 0xADDR> instead of
<TypeName object at 0xADDR>.
The only people being hurt by this change are the ones using doctests:
they deserve it.
This happens for Socket connections, not for TCP ones, where a result
containing an error is returned and correctly handled by pq_raise()
Closes ticket #196 but not #192: poll() still doesn't change the
connection's closed state.
The moment it is called shouldn't have really changed, but it's more
explicit when it happens. Previously it was sort of obfuscated behind a
roundtrip through the green callback and poll.
Dropped encoding parameter in the constructor: it is used
nowhere and not documented. Use the connection encoding
directly if available, else the previous latin1 fallback.
tp_clear should only be used to break the reference cycles. tp_clear was
causing a segfault because it was called twice (by the gc and by _dealloc) so
self->codec was freed twice.
Amazingly the double free was only causing a segfault on Python 3.3 (released
in late 2012) talking to Postgres 8.1 (released in 2005) in async mode... no
other combination crashed. Thank you buildbot.