RobertM

12/14/2022, 3:45 AM
Hello 👋 I'm using the operator to deploy a cluster and ran into a problem with running migrations. Long story short - the pod that runs the migrations seems to be stuck.
I can see in its logs that it was running some migrations:
{"level":"info","new level":"debug","time":"2022-12-14T03:08:46Z","message":"set log level"}
{"level":"info","new provider":"none","time":"2022-12-14T03:08:46Z","message":"set tracing provider"}
{"level":"warn","this-version":"v1.11.0","error":"Get \"https://api.github.com/repos/authzed/spicedb/releases/latest\": read tcp 10.13.95.63:55642->20.27.177.116:443: read: connection reset by peer","time":"2022-12-14T03:08:46Z","message":"could not perform version checking; if this problem persists or to skip this check, add --skip-release-check=true"}
{"level":"info","time":"2022-12-14T03:08:46Z","message":"migrating postgres datastore"}
{"level":"info","targetRevision":"head","time":"2022-12-14T03:08:46Z","message":"running migrations"}
{"level":"info","from":"","to":"1eaeba4b8a73","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"1eaeba4b8a73","to":"add-reverse-index","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"add-reverse-index","to":"add-unique-living-ns","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"add-unique-living-ns","to":"add-transaction-timestamp-index","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"add-transaction-timestamp-index","to":"change-transaction-timestamp-default","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"change-transaction-timestamp-default","to":"add-gc-index","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"add-gc-index","to":"add-unique-datastore-id","time":"2022-12-14T03:08:46Z","message":"migrating"}
{"level":"info","from":"add-unique-datastore-id","to":"add-ns-config-id","time":"2022-12-14T03:08:46Z","message":"migrating"}
But then it just stays like that forever.
I can also see in Postgres that the tables and indices were created.
Any idea what the problem could be?
Joey

12/14/2022, 3:50 AM
@jzelinskie
RobertM

12/14/2022, 6:11 AM
I'm running 13.5 so that could be the reason - I'm going to update it and give it another try!
Hi! It turned out that the issue was caused by Istio: the injected Envoy sidecar kept the Job alive indefinitely, even though all the migrations had run correctly. Would you guys be interested in documenting how to use the operator with Istio, or potential workarounds for this case? Related links: https://discuss.istio.io/t/best-practices-for-jobs/4968 https://discuss.istio.io/t/best-practices-for-jobs/4968/3
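The standard workaround discussed in those threads is to skip sidecar injection for the Job's pods entirely - assuming the migration pod can reach Postgres without going through the mesh. As a sketch, the relevant fragment of the Job's pod template would be:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
Without the sidecar, the pod exits as soon as the migration container finishes and the Job can complete normally.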
Another problem created by Istio: I had to add a ServiceEntry to enable dispatching between SpiceDB instances. Whether this problem occurs probably depends on your Istio configuration, but it could also be documented (maybe?)
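For reference, a sketch of what such a ServiceEntry might look like - the name, namespace, and hosts below are made up, and 50053 is just SpiceDB's default --dispatch-cluster-port, so adjust everything to your setup:
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: spicedb-dispatch
  namespace: spicedb
spec:
  # pod-to-pod dispatch goes through the headless service, so register
  # those hostnames with the mesh rather than letting Envoy drop them
  hosts:
  - "*.spicedb.spicedb.svc.cluster.local"
  location: MESH_INTERNAL
  resolution: NONE
  ports:
  - number: 50053
    name: grpc-dispatch
    protocol: GRPC
EOF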
Joey

01/06/2023, 5:39 AM
we'd definitely be happy to have docs contributed for this
RobertM

01/06/2023, 5:47 AM
I'm currently in the process of getting permission to contribute etc, so we'll see how I can help!
Btw, what do you think about allowing the CMD of the migration pod to be overridden? I imagine it currently executes something like
spicedb migrate head
If we could run
curl -sf -XPOST http://127.0.0.1:15020/quitquitquit
after that, it would shut down the Envoy sidecar. Allowing a custom CMD, or passing in a script file, would be helpful in situations like this (though I know this environment has its limitations - I'm not sure curl is even available)
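Concretely, the wrapper would be something along these lines - a sketch that assumes a shell and curl are present in the migration image, and Istio's default pilot-agent admin port of 15020:
#!/bin/sh
# run the migration and remember its exit code
spicedb migrate head
status=$?
# ask the Envoy sidecar to shut down so the Job can complete
curl -sf -XPOST http://127.0.0.1:15020/quitquitquit
# report the migration's result, not curl's
exit $status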
jzelinskie

01/06/2023, 7:44 PM
Maybe it'd make more sense to just support detecting Istio and making that call for folks, so that they don't bang their head against the wall until they find the docs