vad8615

05/02/2023, 10:02 AM

Hi, we noticed during DB maintenance that SpiceDB 1.19 is not able to reconnect to PG after a failover. The fix was to restart the SpiceDB pod. Is this known behaviour, or should we investigate further? We run on AWS using RDS with Multi-AZ.

vroldanbet

05/02/2023, 11:54 AM

That seems unexpected. Theoretically, if pgx detects a connection to the backend is broken, it would remove it from the pool. I guess the usual node drain procedure for horizontally scalable databases like CRDB or Spanner does not work here (single primary), so SpiceDB has to "take the hit", but I would have expected it to eventually recover without manual intervention. I'd suggest opening an issue.
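
To make the expected behaviour concrete, here is a toy sketch of a pool that discards idle connections failing a liveness check and dials a fresh one. It is not pgx's implementation; `Conn`, `Pool`, `Ping`, and `Acquire` are hypothetical names and the addresses are placeholders.

```go
package main

import "fmt"

// Conn is a hypothetical stand-in for a database connection.
type Conn struct {
	Addr   string
	Broken bool // set when the backend behind this connection went away
}

// Ping reports whether the connection is still usable.
func (c *Conn) Ping() error {
	if c.Broken {
		return fmt.Errorf("connection to %s is broken", c.Addr)
	}
	return nil
}

// Pool is a toy pool: idle connections plus a dial function for new ones.
type Pool struct {
	idle []*Conn
	dial func() (*Conn, error)
}

// Acquire returns the first healthy idle connection, discarding broken ones,
// and dials a fresh connection when no idle connection passes the check.
func (p *Pool) Acquire() (*Conn, error) {
	for len(p.idle) > 0 {
		c := p.idle[0]
		p.idle = p.idle[1:]
		if c.Ping() == nil {
			return c, nil
		}
		// Broken connection: drop it and try the next one.
	}
	return p.dial()
}

func main() {
	p := &Pool{
		// One idle connection that still points at the old primary.
		idle: []*Conn{{Addr: "10.0.0.1:5432", Broken: true}},
		// After the failover, dialing reaches the new primary.
		dial: func() (*Conn, error) { return &Conn{Addr: "10.0.0.2:5432"}, nil },
	}
	c, err := p.Acquire()
	fmt.Println(c.Addr, err)
}
```

With a pool behaving like this, a failover would cost a few failed checks and then recover on its own, which is why a required pod restart looks unexpected.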

vad8615

05/02/2023, 12:03 PM

I'll try to reproduce it and then open an issue. Thank you

vad8615

05/03/2023, 11:10 AM

I'm not able to reproduce the issue locally. I tried with a PG container that I moved to another IP (docker-compose down / up), with SpiceDB 1.19.0 pointed at a hostname defined in /etc/hosts (which I switched after the IP change). The main difference I can think of is that DNS resolution is not involved, but according to https://github.com/jackc/pgx/issues/913 it shouldn't be a problem.
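
One way to compare name resolution between the binary and the container is a small stdlib-only check like the sketch below, run in both environments. It only uses Go's `net` resolver and says nothing about how pgx or SpiceDB resolve names internally; `resolve`, `stale`, and the host and IP values are hypothetical placeholders.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolve looks up the addresses a hostname currently resolves to,
// giving up after five seconds.
func resolve(host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	addrs, err := net.DefaultResolver.LookupHost(ctx, host)
	if err != nil {
		return nil, fmt.Errorf("lookup %s: %w", host, err)
	}
	return addrs, nil
}

// stale reports whether connectedIP is absent from the addresses the
// hostname currently resolves to.
func stale(resolved []string, connectedIP string) bool {
	for _, a := range resolved {
		if a == connectedIP {
			return false
		}
	}
	return true
}

func main() {
	// Placeholders: substitute the database hostname and the IP that the
	// existing connections were opened against.
	host, connectedIP := "localhost", "127.0.0.1"

	addrs, err := resolve(host)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("resolved:", addrs, "stale:", stale(addrs, connectedIP))
}
```

If the container keeps reporting the old address after the switch while the binary reports the new one, that would point at resolution inside the image rather than at the pool.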

vroldanbet

05/03/2023, 1:30 PM

Hmm, are you running the SpiceDB binary or the container for that test?

vroldanbet

05/03/2023, 1:30 PM

I'm asking in case the DNS resolution problems are related to the container base image (we use Chainguard images).

vad8615

05/03/2023, 2:45 PM

Container in prod (k8s), binary locally. Will try that.
