Watch lag
# spicedb
a
Does anyone have some numbers on how much lag there is between a relationship write and a watch reply event for that write?
v
Which database are you using?
Postgres should be pretty fast
a
Planning to use postgres, just looking for rough numbers before experimenting myself
Looks like it is polling and uses

```
--watch-api-heartbeat duration                                    heartbeat time on the watch in the API. 0 means to default to the datastore's minimum. (default 1s)
```

for the poll time?
v
yeah, it will be controlled by the heartbeat, that's right
postgres tends to be as fast as the poll
with CRDB other factors come into play, like load on the CRDB side, since it's all based on changefeeds, which is a fairly complex piece of machinery
wondering what you are planning to do with it and how much lag would be acceptable for your use case
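For a rough sense of the numbers: with a poll-driven watch, a write is surfaced by the first poll tick at or after it commits, so the heartbeat interval bounds the extra lag. A toy model of that arithmetic (not SpiceDB code, and it ignores query/network cost):

```python
import math

def watch_lag(commit_time: float, poll_interval: float) -> float:
    """Extra delay before a poll-based watch can surface a write
    committed at commit_time (seconds)."""
    # The first poll tick at or after the commit picks up the change.
    next_tick = math.ceil(commit_time / poll_interval) * poll_interval
    return next_tick - commit_time

# With the default 1s heartbeat: worst case approaches 1s, average ~0.5s.
lags = [watch_lag(t / 1000, 1.0) for t in range(1, 1000)]
print(max(lags), sum(lags) / len(lags))
```

So with the default 1s heartbeat, the poll itself contributes up to about a second on top of whatever the datastore and network add.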
a
Thanks and good q – answer is it depends. We're working through many use cases of different applications and trying to understand different replication options and their tradeoffs. For example, we are using a GCP-IAM-like model and we have the common requirement that both application and SpiceDB need to track what "Project" a given resource is in. (The application will use this as a filter, for example, so it is used on top of a lookup-resources query.) There are also other cases where access control is somewhat attribute-based and so the attributes have to be stored both in the system that owns the domain and into SpiceDB for access decisions. For most of these cases I think listening back to Watch should be fast enough (e.g. within single digit seconds). Some cases we had some pushback where folks wanted it to be effectively instant or close to (like their current, built-in access control is). For these cases I am considering using Debezium to replicate changes into SpiceDB.
(So app writes to DB, debezium gets the change ~immediately and writes it to Kafka, a listener gets notified of that and writes to SpiceDb)
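A minimal sketch of the translation step in that pipeline: the Kafka listener maps a Debezium change event into SpiceDB relationship updates. The table shape (`resource_id`, `project_id`), the relation name `project`, and the plain-dict update format are all illustrative assumptions, not a real schema or the real client API:

```python
# Sketch: map a Debezium CDC envelope to SpiceDB relationship updates.
# Field names and the "project" relation are hypothetical examples.

def to_relationship_updates(event: dict) -> list[dict]:
    op = event["op"]  # Debezium ops: "c" = create, "u" = update, "d" = delete
    updates = []
    if op in ("c", "u"):
        row = event["after"]
        updates.append({
            "operation": "TOUCH",  # idempotent upsert, safe on Kafka replays
            "relationship": {
                "resource": f"resource:{row['resource_id']}",
                "relation": "project",
                "subject": f"project:{row['project_id']}",
            },
        })
    if op in ("u", "d"):
        # On update, drop the old project link if it changed; on delete, drop it.
        before = event.get("before")
        if before and (op == "d" or before["project_id"] != event["after"]["project_id"]):
            updates.append({
                "operation": "DELETE",
                "relationship": {
                    "resource": f"resource:{before['resource_id']}",
                    "relation": "project",
                    "subject": f"project:{before['project_id']}",
                },
            })
    return updates

# A row moving from project p1 to p2 yields a TOUCH for p2 and a DELETE for p1.
moved = to_relationship_updates({
    "op": "u",
    "before": {"resource_id": "r1", "project_id": "p1"},
    "after": {"resource_id": "r1", "project_id": "p2"},
})
```

Using TOUCH rather than CREATE keeps the consumer idempotent, which matters because Kafka delivery is at-least-once.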
Or we may find that seconds is enough 🙂 So we'll see. There are some other cases for Debezium, though, like when the app's transaction contains both data it needs in SpiceDB as well as its own data – so to make that reliably consistent it writes to its own DB, then uses Debezium to replicate.
It did make me wonder if the approach Debezium uses (logical replication https://www.postgresql.org/docs/current/logical-replication-publication.html) could be used in Watch rather than polling.
j
it probably could
but that would require more infrastructure to run
v
> we have the common requirement that both application and SpiceDB need to track what "Project" a given resource is in. (The application will use this as a filter, for example, so it is used on top of a lookup-resources query.)

This sounds a lot like https://github.com/authzed/spicedb/issues/1317. Are you basically using it as a means to scope down the searchable graph? Joey recently opened https://github.com/authzed/spicedb/pull/1905, and he gave some thought to how to implement 1317 on the foundation in that PR: https://github.com/authzed/spicedb/issues/1317#issuecomment-2126033276

> There are also other cases where access control is somewhat attribute-based and so the attributes have to be stored both in the system that owns the domain and into SpiceDB for access decisions.

Genuine question: why would the system that owns the domain want to replicate from SpiceDB? Isn't SpiceDB essentially a denormalization of the data in the main database?

> For most of these cases I think listening back to Watch should be fast enough (e.g. within single digit seconds). Some cases we had some pushback where folks wanted it to be effectively instant or close to (like their current, built-in access control is). For these cases I am considering using Debezium to replicate changes into SpiceDB.

I see, so what you are saying here is that SpiceDB is the source of truth for some data, and that's why Watch API lag becomes relevant, because the data is actually denormalized into the main domain database. Throughout my career I've always used the term "authorization-related" or "authorization-adjacent" data: all business data can be used for authorization decisions, and conversely there is rarely data used exclusively for authorization decisions (e.g. you'll eventually find some use case that is not purely authorization related). For that reason I lean more towards using SpiceDB like you use an index: a domain-specific engine to efficiently compute authz decisions.
So this is to say Debezium (and generally any transaction-log-tailing strategy) works well with SpiceDB. The particular challenge associated with it is addressing scenarios where you need read-your-writes consistency. The client application won't have access to the zedtoken generated unless you feed the result from the write back into Kafka again and it is written into the corresponding domain database row. Think scenarios like creating a new resource and redirecting the user to it.

Another strategy we've been using internally is Durable Workflows (this is a good summary from one of the providers in the space: https://www.golem.cloud/post/the-emerging-landscape-of-durable-computing). This is akin to using Temporal, without having to run it. We've successfully used https://github.com/cschleiden/go-workflows internally to durably persist writes to the kube API server and SpiceDB. This addresses the issues with dual writes, and avoids the complexity of running an external workflow orchestrator. It does not come for free though: it's a bit of a paradigm shift in how you code your business logic, with heavy inversion of control.
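The zedtoken round-trip described above can be sketched end to end. Everything here is simulated in memory (no real SpiceDB, Kafka, or domain DB; the `FakeSpiceDB` class and token format are stand-ins); the point is only the data flow: write, capture the token, persist it on the domain row, then check at least as fresh as that token:

```python
# In-memory simulation of the read-your-writes flow: the token produced by
# the authz write is persisted onto the domain row, so later checks can
# demand a snapshot at least as fresh as that revision.

class FakeSpiceDB:
    def __init__(self):
        self.revision = 0
        self.tuples = {}  # relationship -> revision it was written at

    def write(self, relationship: str) -> str:
        self.revision += 1
        self.tuples[relationship] = self.revision
        # WriteRelationships returns a token for the commit revision.
        return f"zedtoken:{self.revision}"

    def check(self, relationship: str, at_least_as_fresh: str) -> bool:
        # A real server would route to a snapshot at this revision; here we
        # just verify the store is new enough and the tuple exists.
        wanted = int(at_least_as_fresh.split(":")[1])
        return relationship in self.tuples and self.revision >= wanted

spicedb = FakeSpiceDB()
domain_db = {}  # domain rows, each carrying the zedtoken of its authz write

token = spicedb.write("doc:readme#viewer@user:alice")
domain_db["doc:readme"] = {"owner": "alice", "zedtoken": token}

# Later, e.g. after redirecting the user to the new resource, the stored
# token guarantees the check sees the write:
row = domain_db["doc:readme"]
allowed = spicedb.check("doc:readme#viewer@user:alice", row["zedtoken"])
```

Storing the token on the row is what closes the loop: without it, a check issued right after the redirect could hit a snapshot older than the write.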
> Some cases we had some pushback where folks wanted it to be effectively instant or close to (like their current, built-in access control is). For these cases I am considering using Debezium to replicate changes into SpiceDB.

Is this a matter of latency, or is it because they want transactional guarantees with the data being written? (e.g. they write to the DB and assume the next query sees the data)
a
Thanks so much for thoughtful responses – bit tied up currently but will digest ASAP
> Is this a matter of latency, or is it because they want transactional guarantees with the data being written? (e.g. they write to the DB and assume the next query sees the data)

Good Q. I have to dig into this. I think we may have both, but I'll get specifics...
> The client application won't have access to the zedtoken generated unless you feed the result from the write back into kafka again and that is written into the corresponding domain database row

This is interesting... the watch response includes the zedtoken for that update, right?
j
yes
a
> This sounds a lot like https://github.com/authzed/spicedb/issues/1317. Are you basically using it as a means to scope down the searchable graph?
> Joey recently opened https://github.com/authzed/spicedb/pull/1905, and he gave some thought to how to implement 1317 on the foundation in that PR https://github.com/authzed/spicedb/issues/1317#issuecomment-2126033276

Yes – this use case comes up for us often (using it as a means to scope down the searchable graph).