# spicedb
t
Hey guys, has anyone tried out `enable-experimental-watchable-schema-cache` with Postgres as the datastore? What Prometheus metrics should I look out for? Thanks 🙂
v
You should check these:
spicedb_datastore_watching_schema_cache_definitions_read_cached_total
spicedb_datastore_watching_schema_cache_definitions_read_total
spicedb_datastore_watching_schema_cache_tracked_revision
spicedb_datastore_watching_schema_cache_namespaces_fallback_mode
spicedb_datastore_watching_schema_cache_caveats_fallback_mode
spicedb_datastore_spanner_schema_watch_retries_bucket
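One way to keep an eye on these is sketched below: a small Python helper that reads Prometheus text-format output and reports the share of definition reads served from the cache, plus whether either fallback gauge is set. The metric names come from the list above; everything else is an assumption made for illustration: the helper names `parse_metrics` and `summarize`, the `SAMPLE` values, the idea that `read_cached_total / read_total` is a meaningful ratio, and that sample lines carry no trailing timestamp.

```python
# Minimal sketch, not an official SpiceDB tool. Metric names are from the thread;
# the sample values and the interpretation of the ratio are assumptions.

PREFIX = "spicedb_datastore_watching_schema_cache_"

# Hypothetical scrape output, only used to exercise the helpers below.
SAMPLE = """\
# TYPE spicedb_datastore_watching_schema_cache_definitions_read_total counter
spicedb_datastore_watching_schema_cache_definitions_read_total 40
spicedb_datastore_watching_schema_cache_definitions_read_cached_total 30
spicedb_datastore_watching_schema_cache_namespaces_fallback_mode 1
spicedb_datastore_watching_schema_cache_caveats_fallback_mode 0
"""


def parse_metrics(text):
    """Return {metric name: value}, summing series that differ only by labels."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        try:
            totals[name] = totals.get(name, 0.0) + float(value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return totals


def summarize(text):
    """Summarize cache usage and fallback state from a metrics dump."""
    m = parse_metrics(text)
    reads = m.get(PREFIX + "definitions_read_total", 0.0)
    cached = m.get(PREFIX + "definitions_read_cached_total", 0.0)
    return {
        "cached_read_ratio": cached / reads if reads else None,
        "namespaces_fallback": m.get(PREFIX + "namespaces_fallback_mode", 0.0) >= 1,
        "caveats_fallback": m.get(PREFIX + "caveats_fallback_mode", 0.0) >= 1,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In practice the text would come from wherever your SpiceDB instance exposes its metrics; a fallback gauge stuck at 1 is the signal worth alerting on.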
t
Thanks a lot! I noticed `spicedb_datastore_watching_schema_cache_namespaces_fallback_mode` was set to 1 almost from the beginning. I looked at the spicedb logs and found this:
BUG: *errors.errorString received out of order insertion for definition tenant
/home/runner/actions-runner/_work/spicedb/spicedb/pkg/spiceerrors/bug.go:33 (0x11bec85)
/home/runner/actions-runner/_work/spicedb/spicedb/internal/datastore/proxy/schemacaching/watchingcache.go:509 (0x1c4c317)
/home/runner/actions-runner/_work/spicedb/spicedb/internal/datastore/proxy/schemacaching/watchingcache.go:280 (0x1c4abc5)
/opt/hostedtoolcache/go/1.23.1/x64/src/runtime/asm_amd64.s:1700 (0x47e021)
This is coming from here: https://github.com/authzed/spicedb/blob/abdb041ba17750f1c8edc681d038518b768f37d6/internal/datastore/proxy/schemacaching/watchingcache.go#L517
However, I don't fully understand why this happens. It's trying to add a namespace definition at a revision earlier than the latest one, but why? If it has been in fallback mode since this error occurred, does that mean the cached schema from the watch has not been used all this time? I'm using v1.38.0 with Postgres.
I've restarted with log level set to debug. I'll see if it happens again
v
hmm, haven't seen that one before. Can you open an issue? What SpiceDB version are you using? Are you using any specific Postgres version, or managed service? Are you using read-replicas?
ah sorry, saw you mentioning 1.38
From the code, it would seem as if the watch API is generating out-of-order events, which is unexpected
did you actually enable track_commit_timestamp in postgres?
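To make the "out of order insertion" condition concrete, here is a toy Python sketch of the invariant the error message implies: updates for a given definition are expected to arrive at revisions that never go backwards. This is not SpiceDB's implementation. The class name `DefinitionHistory`, the use of plain integers as revisions, and the choice to raise `ValueError` are all assumptions for illustration.

```python
# Toy illustration only; SpiceDB's real cache and revision types are different.

class DefinitionHistory:
    """Tracks the latest revision seen per definition and rejects older ones."""

    def __init__(self):
        self._latest = {}  # definition name -> highest revision accepted so far

    def insert(self, name, revision):
        latest = self._latest.get(name)
        if latest is not None and revision < latest:
            # A watch stream that emits events out of order would land here.
            raise ValueError(
                f"received out of order insertion for definition {name}"
            )
        self._latest[name] = revision

    def latest(self, name):
        return self._latest.get(name)


if __name__ == "__main__":
    history = DefinitionHistory()
    history.insert("tenant", 5)
    try:
        history.insert("tenant", 3)  # older than the revision already tracked
    except ValueError as err:
        print(err)
```

Under this model, a single event delivered with an older revision than one already applied is enough to trip the check, which matches the suspicion that the watch stream emitted events out of order.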
t
We're using the Azure managed instance of Postgres at version 16.3. The server parameters show `track_commit_timestamp` as enabled. We're not using read replicas tho. Since re-deploying, I haven't seen the `received out of order insertion` log. I'll open an issue on GitHub (never opened one 😄)
v
thanks for reporting 🙏