# spicedb
t
Does anyone know if there is a way to increase the `grpc_health_probe` timeout value? We hit an issue where some heavy load got us into a restart loop because the healthcheck timed out. We're increasing our base number of replicas too, but it would be nice to make this timeout a little longer to prevent the restarts. We're using the spicedb-operator to deploy.
```
timeout: failed to connect service "localhost:50051" within 1s
  Warning  Unhealthy  55m (x6 over 12d)  kubelet  Readiness probe failed: command "grpc_health_probe -v -addr=localhost:50051" timed out
  Warning  Unhealthy  55m                kubelet  Readiness probe failed: parsed options:
```
One other odd thing: the error seems to indicate the timeout is 1s, but when I describe the pod I see 5s
```
Liveness:       exec [grpc_health_probe -v -addr=localhost:50051] delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:      exec [grpc_health_probe -v -addr=localhost:50051] delay=0s timeout=5s period=10s #success=1 #failure=5
```
y
so i'm not sure where it's getting the 1s
it looks like there's now a way to do it natively, so that utility might not even be necessary: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe
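for reference, a native gRPC probe (stable since Kubernetes 1.24) would look roughly like this; the port is taken from the log above, and `timeoutSeconds` applies directly to the probe RPC with no external binary involved. just a sketch, not what the operator actually generates:

```yaml
# sketch: native gRPC readiness probe; the kubelet calls the standard
# grpc.health.v1.Health service on the container directly
readinessProbe:
  grpc:
    port: 50051
  timeoutSeconds: 5    # applies to the health-check RPC itself
  periodSeconds: 10
  failureThreshold: 5
```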
so i think it's probably something where the default of the utility is being used and the additional configuration isn't actually applied
i asked, and the 5s is the timeout of the kubelet's call to the command used in the health probe, rather than something that's plumbed through to the probe itself, so the probe is still using the 1s default for the `grpc_health_probe` command
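if that's the case, a workaround would be passing explicit timeout flags to the binary: `grpc_health_probe` has `-connect-timeout` and `-rpc-timeout` flags, both defaulting to 1s, which matches the "within 1s" error. a sketch of what the exec probe would need to look like (whether the spicedb-operator actually lets you override the generated probe this way is a separate question):

```yaml
# sketch: exec probe with the utility's own timeouts raised to match
# the kubelet-side timeoutSeconds
readinessProbe:
  exec:
    command:
      - grpc_health_probe
      - -v
      - -addr=localhost:50051
      - -connect-timeout=5s   # utility's dial timeout (default 1s)
      - -rpc-timeout=5s       # utility's RPC timeout (default 1s)
  timeoutSeconds: 5           # kubelet's timeout for the exec call itself
```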
t
Ok got it, thanks for diving in!
t
Oh didn’t recognize the handle lol, I was wondering if you had started
y
hahaha yeah that was yesterday ^.^