# spicedb
w
Hello Authzed people! We had something fun happen on our SpiceDB deployment yesterday: a random SIGSEGV 🥲

- spicedb v1.25.0
- The crashes happened at pod start. The pod kept restarting for exactly an hour, then it resolved itself.
- The first instance of the crash was at pod start (probably k8s rescheduling a pod), not on a running instance.
- After a few restarts, k8s scheduled a different pod (on a different node): it hit the same thing (so it's not a pod- or node-specific problem).
- Started at 20:48 UTC, resolved itself at 21:49 (5 min backoff; the previous failed restart was at 21:44).
- When it resolved itself, it happened to be just when k8s rescheduled the workload onto a third pod. Could be a coincidence.
- The other running pods were fine (although maybe restarting them at that time would have made them go loopy too).
- This was not related to any deployment; we haven't deployed anything in a while. We've been running 1.25.0 for a while too, and never had this issue.
- It doesn't seem correlated with any weird traffic; this is a rather low-traffic time for us (and it was crashing at startup, before any traffic, anyway).

Any idea what this could have been? Has this been seen before?
https://cdn.discordapp.com/attachments/844600078948630559/1214870533242494976/image.png?ex=65faaf5a&is=65e83a5a&hm=2610599845de5425fbcfd64541f7590fe7b4deae45522f732cdd9ba95c6bd746&
This screenshot shows all the logs we got from a pod spinning up until its crash; then it repeats.
v
Hey William, if you backscroll in Discord, you can see this was reported and fixed yesterday. It happened because we pushed a release with a naming scheme that caused the version check on startup to panic. It's now fixed by pushing a new release, and we opened a PR to address the panic.
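To illustrate the class of bug described above (not SpiceDB's actual code; all names here are hypothetical), a non-critical startup task like a "is there a newer release?" check can take the whole process down if a panic inside it is not contained. A minimal Go sketch of the failure mode and the usual mitigation:

```go
package main

import "fmt"

// checkForNewRelease stands in for a non-critical startup task, e.g. comparing
// the running version against the latest published release tag. It panics when
// the tag doesn't follow the expected "vMAJOR.MINOR.PATCH" naming scheme,
// simulating the kind of crash described above.
func checkForNewRelease(latestTag string) {
	if len(latestTag) < len("v0.0.0") || latestTag[0] != 'v' {
		panic(fmt.Sprintf("malformed release tag: %q", latestTag))
	}
	fmt.Printf("latest release: %s\n", latestTag)
}

// safeVersionCheck isolates the check so a panic there is converted into an
// error (to be logged) instead of crashing the server at startup; this is the
// general fix for this class of bug.
func safeVersionCheck(latestTag string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("version check failed: %v", r)
		}
	}()
	checkForNewRelease(latestTag)
	return nil
}

func main() {
	fmt.Println(safeVersionCheck("v1.25.0")) // a well-formed tag: nil error
	fmt.Println(safeVersionCheck("nightly")) // unexpected naming scheme: error, not a crash
}
```

The key design point is that the check's result is advisory, so the `recover` turns a fatal panic into something the caller can log and ignore.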
w
My bad, it didn't occur to me that this class of bug might already have been raised. Thank you! (Your last message was probably meant for another thread.)
v
ah shoot yeah, def for another thread, thank you!