Mark McLaughlin

02/10/2023, 9:57 AM
Hi SpiceDB experts. 🙂 I'm investigating how data that may be used for authz decisions should be mastered in an organisation's IT landscape. Data could be a) in pre-existing systems (e.g. SSO for users and groups) and need to be synced to SpiceDB, b) mastered intentionally in a new service, because the data is rich and only partly used for authz (sync needed here as well), or c) mastered directly in SpiceDB (where life should be easy). Do you have any guidelines for how teams should decide where data should be mastered in particular cases? Have you done any analysis of the consistency issues that arise when data is distributed that I could look at?

vroldanbet

02/13/2023, 11:17 AM
Hey @Mark McLaughlin 👋🏻 saw your message on Friday - it's a good question! Also one that I cannot give a satisfyingly simple answer to, but let's try 😄
The general guideline would be to have SpiceDB be the source of truth for all data necessary to perform authorization decisions, unless that data has uses other than authorization. That's the happy case, but a perhaps more common scenario is when application data is used for both business logic and authorization logic. And more generally, in a service architecture, multiple services own the data for their corresponding bounded contexts. Here you enter "data replication" and "multiple transaction boundary" territory, about which a lot has been written, but I always like to start by suggesting the great article from Red Hat on the topic: https://developers.redhat.com/articles/2021/09/21/distributed-transaction-patterns-microservices-compared
Personally, I'm not aware that we've published a formal analysis of the consistency implications when data is replicated to SpiceDB - I cannot speak for the rest of the team since I joined relatively recently - but my understanding is that, depending on the strategy you take, you'd need to make some tradeoffs. Unless you are doing something like XA transactions (which SpiceDB does not support right now, btw), you'll be losing isolation guarantees. When data is replicated using strategies like transaction-log tailing, Event Sourcing / CQRS, etc., the data will be eventually consistent, and you'll have the challenge of how to "send back" the zedtoken to the application.
But to be fair, none of this is really specific to SpiceDB (sans the zedtoken part); it's the tax you pay once you enter service-oriented architectures. In that sense, SpiceDB will still behave with the strong consistency guarantees advertised, but the challenge is the tradeoffs made when data is replicated. The answer is unfortunately an unsatisfying "it depends on your architecture and platform guidelines".
If you happen to have a message bus like Kafka around, I'd say look into Debezium. If you have already adopted Event Sourcing, then the barrier to entry will probably be lower. If you cannot live with `minimize_latency` request consistency and you need stronger bounded consistency, then look into emitting an event with the zedtoken so that the client application can write it back to the database, or consider using `fully_consistent`.
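A rough sketch of the zedtoken write-back idea, under stated assumptions: the event shape (`resource_id`, `zedtoken`), the `rows` dict standing in for the application database, and both function names are hypothetical. The returned dicts only mirror the field names of SpiceDB's `Consistency` request message (`minimize_latency`, `at_least_as_fresh`, `fully_consistent`); they are plain data here, not a call into a real client library.

```python
from typing import Optional


def on_relationship_written(rows: dict, event: dict) -> None:
    # Hypothetical event handler: store the zedtoken next to the resource
    # it protects, so later checks can ask for data at least that fresh.
    rows[event["resource_id"]]["zedtoken"] = event["zedtoken"]


def choose_consistency(stored_zedtoken: Optional[str], must_be_current: bool = False) -> dict:
    if must_be_current:
        # Strongest option; bypasses caches, so expect higher latency.
        return {"fully_consistent": True}
    if stored_zedtoken:
        # Bounded staleness: no older than the write that produced this token.
        return {"at_least_as_fresh": {"token": stored_zedtoken}}
    # No token written back yet: fall back to the fastest option.
    return {"minimize_latency": True}


rows = {"doc1": {"zedtoken": None}}
before = choose_consistency(rows["doc1"]["zedtoken"])
on_relationship_written(rows, {"resource_id": "doc1", "zedtoken": "example-token"})
after = choose_consistency(rows["doc1"]["zedtoken"])
```

Until the event arrives, checks for that resource run with `minimize_latency`; once the token is stored, they can request `at_least_as_fresh` instead of paying for `fully_consistent` on every call.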

Mark McLaughlin

02/13/2023, 11:28 AM
Thanks @vroldanbet for your comprehensive answer! I think what you have said gels with our current understanding also. If I have any further questions, I will follow up here.