# spicedb
v
hi, we noticed during a DB maintenance that spicedb 1.19 is not able to reconnect to PG after a failover. The fix has been a pod restart for spicedb. Is this a known behaviour or should we investigate more? We run on AWS using RDS with multi AZ.
v
That seems unexpected. Theoretically, if `pgx` detects a connection to the backend is broken, it would remove it from the pool. I guess the usual node drain procedure for horizontally scalable databases like CRDB or Spanner does not work here (single primary), so SpiceDB has to "take the hit", but I would have expected it to eventually recover without manual intervention. I'd suggest opening an issue.
v
I'll try to reproduce it and then open an issue. Thank you
I'm not able to reproduce the issue locally. I tried with a PG container that I moved to another IP (docker-compose down / up), with spicedb 1.19.0 pointed at a hostname defined in `/etc/hosts` (which I switched after the IP change). The main difference I can think of is that DNS resolution is not involved, but according to https://github.com/jackc/pgx/issues/913 it shouldn't be a problem.
v
hmm, are you running the spicedb binary or the container for that test?
I'm asking in case the DNS resolution problem is related to something in the container base image (we use Chainguard images).
v
container in prod (k8s), binary locally. Will try that
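To check the DNS angle from inside the same container image, here is a minimal sketch (not from this thread) that polls a hostname and prints each time the resolved addresses change, for example during an RDS failover. The names `watch`, `changed`, and `lookupFunc`, and the poll count and interval, are hypothetical. It only shows what the container's resolver returns and says nothing about how `pgx` or SpiceDB handle the change.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"sort"
	"strings"
	"time"
)

// lookupFunc matches the signature of net.LookupHost so a fake can be swapped in.
type lookupFunc func(host string) ([]string, error)

// changed reports whether two sorted address lists differ.
func changed(prev, cur []string) bool {
	return strings.Join(prev, ",") != strings.Join(cur, ",")
}

// watch resolves host `polls` times, sleeping `interval` between attempts,
// and returns every new address set it observed, in order.
func watch(host string, lookup lookupFunc, polls int, interval time.Duration) [][]string {
	var seen [][]string
	var prev []string
	for i := 0; i < polls; i++ {
		if i > 0 {
			time.Sleep(interval)
		}
		cur, err := lookup(host)
		if err != nil {
			fmt.Printf("%s lookup failed: %v\n", time.Now().Format(time.RFC3339), err)
			continue
		}
		sort.Strings(cur)
		if changed(prev, cur) {
			fmt.Printf("%s %s -> %v\n", time.Now().Format(time.RFC3339), host, cur)
			seen = append(seen, cur)
			prev = cur
		}
	}
	return seen
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: dnswatch <hostname>")
		return
	}
	// Assumed values: poll every 5 seconds for about 10 minutes.
	watch(os.Args[1], net.LookupHost, 120, 5*time.Second)
}
```

Run it in a pod built from the same base image with the RDS endpoint as the argument, then trigger the failover. If the printed address never changes, the problem is likely on the resolution side rather than in the connection pool.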