# spicedb
j
This might be the TLS connection between SpiceDB instances ("dispatcher") and not the datastore
j
Thanks Jimmy! We see messages like this too, which is why I suspected the database as being the culprit:
{
  "level": "error",
  "module": "pgx",
  "pgx": {
    "database": "spicedb",
    "err": "failed to connect to `host=REDACTED_IP_ADDRESS user=spicedb database=spicedb`: failed SASL auth (write failed: write tcp REDACTED_IP_ADDRESS:54184->REDACTED_IP_ADDRESS:5432: i/o timeout)",
    "host": "REDACTED_IP_ADDRESS",
    "port": 5432,
    "time": 35.729385
  },
  "time": "2023-08-24T17:52:47Z",
  "message": "Connect"
}
j
yeah that definitely confirms it's the connection to the database
Is this pod the only SpiceDB pod running on that node? I wonder if it's something to do with that particular node
What DB provider are you using? What's the SASL auth mechanism?
j
We're using Cloud SQL (PostgreSQL) with whatever the default password thing is. Something we changed recently is that we configured a TLS secret for SpiceDB, but I think that's only used for dispatcher/serving connections, not for outbound connections to the database, right? (Also, if it was a cert issue I'd have expected an x509 error.)
We have a few replicas and they're running on different nodes. I also ran an ubuntu pod in the same namespace and it can connect to the db using user/password
I swear that this was working yesterday haha
j
Yeah, there are two different flags for the TLS for SpiceDB itself and the datastore
j
So, the problem was the datasource URI we were using - we had
sslmode=disable
and it seemed to be some kind of weird incompatibility between pgx and Cloud SQL, or maybe a database setting we've configured. The server was logging a connection reset and the client was logging a timeout. Changing from:
postgres://spicedb:.....@xxx.xxx.xxx.xxx/spicedb?sslmode=disable
to remove
sslmode=disable
fixed it. The server was logging:
2023-08-24 19:12:30.364 UTC [289145]: [1-1] db=spicedb,user=spicedb LOG:  could not receive data from client: Connection reset by peer
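For anyone hitting the same thing, here is a minimal sketch of the change described above: dropping the sslmode parameter from a Postgres connection URI so the driver falls back to its own default (for pgx that is understood to be sslmode=prefer, as in libpq, but check this against the version you run). The helper name dropSSLMode, the host, and the credentials are made up for illustration and are not the values from this thread. It only uses the Go standard library and does not connect to anything.

```go
package main

import (
	"fmt"
	"net/url"
)

// dropSSLMode returns connURI with any sslmode query parameter removed.
// Other query parameters are kept (re-encoded in sorted key order).
func dropSSLMode(connURI string) (string, error) {
	u, err := url.Parse(connURI)
	if err != nil {
		return "", fmt.Errorf("parse datastore URI: %w", err)
	}
	q := u.Query()
	q.Del("sslmode")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Placeholder URI; substitute your own datastore connection string.
	before := "postgres://spicedb:secret@db.example.internal:5432/spicedb?sslmode=disable"
	after, err := dropSSLMode(before)
	if err != nil {
		panic(err)
	}
	fmt.Println(after)
}
```

If you would rather be explicit than rely on the driver default, setting sslmode to a concrete value such as require is another option; that was not tested in this thread.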
j
glad you figured it out... i know pgx does rewrite some of the URI values, but idk about sslmode