We didn't have anything set before, so presumably the defaults were unhelpful. Slightly less than 30 seconds sounds like a good choice then.
It could have been the LB cutting the connection that put it into a bad state. All the requests we could see making it to the ingress were on the order of 5-50 ms, though.
Mostly looks like a Python gRPC issue, where the library just doesn't recover if the connection gets into a bad state. 🐍 🐍
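For reference, a rough sketch of what client-side keepalive could look like in Python gRPC, assuming the setting being discussed is the keepalive interval and that "slightly less than 30 seconds" means something like 25 s. The 25 s / 5 s values, the helper names, and the use of a TLS channel are all assumptions for illustration, not what is actually deployed here.

```python
from typing import List, Tuple

ChannelOption = Tuple[str, int]


def keepalive_options(
    keepalive_time_s: float = 25.0,    # assumed: "slightly less than 30 seconds"
    keepalive_timeout_s: float = 5.0,  # assumed: how long to wait for a ping ack
) -> List[ChannelOption]:
    """Build gRPC channel options that make the client ping the server periodically."""
    return [
        # Send an HTTP/2 keepalive ping after this much time.
        ("grpc.keepalive_time_ms", int(keepalive_time_s * 1000)),
        # Treat the connection as dead if the ping is not acked within this window.
        ("grpc.keepalive_timeout_ms", int(keepalive_timeout_s * 1000)),
        # Keep pinging even when no RPC is in flight.
        ("grpc.keepalive_permit_without_calls", 1),
        # 0 removes the cap on pings sent without intervening data frames.
        ("grpc.http2.max_pings_without_data", 0),
    ]


def make_channel(target: str):
    """Open a TLS channel with the keepalive options above (hypothetical helper)."""
    import grpc  # imported lazily so keepalive_options() works without grpcio installed

    return grpc.secure_channel(
        target,
        grpc.ssl_channel_credentials(),
        options=keepalive_options(),
    )
```

One caveat: the server has to tolerate pings this frequent. If its own keepalive enforcement settings are stricter than the client's interval, it may answer with a GOAWAY for too many pings, so the client and server values need to be agreed on together.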
Thanks again. If you do have any insight into what other gRPC server-side settings are in place, it would be appreciated.