Caveats or Resources, how to decide? SpiceDB #spicedb

verdverm.com

09/06/2025, 11:17 PM

I'm trying to represent users, groups, and resources over:
- DIDs (users)
- AtURIs (content-addressable resource ids: `<did>/<space>/[nsid]/<rkey>`)

[full schema](https://github.com/blebbit/atproto/blob/main/packages/pds/src/authz/spicedb/schema/atproto.zed)

I have a resource type for each of the segments, with a parenting setup to support nesting/hierarchy, and each AtURI has an associated data record tied to it, except for `nsid`. They are only for structuring records and granting permissions, in OAuth scopes today, and content/methods tbd, that's what I'm working on. NSIDs are defined by any DID, it's a very dynamic list. There are `nsid` for queries and procedures that will never have records, but we still want to put permissions over them. Certain `nsid` are expected to have high numbers of records, and storing the parenting relationship for all of them in spice seems inefficient? Are caveats something that can help me here? What if those caveats are large sets?

The new way to specify OAuth scopes over the dynamic NSID is as a permission set https://github.com/bluesky-social/proposals/blob/main/0011-auth-scopes/README.md#permission-sets I imagine we need something similar for the custom roles that we want in the content permission system, and then have to reflect those within spice by making a number of calls, or a bulk input? (maybe one day even replace the OAuth permission setup and unify the two... #futurology)

yetitwo

09/07/2025, 2:25 AM

my general advice is that if something can be represented and stored in relationships, it should. relations are going to cache better and (generally) evaluate faster than equivalent logic expressed in caveats. what's the concern with inefficiency here?

verdverm.com

09/08/2025, 8:21 AM

Here's where I got to, decided to try caveats, I like them https://bsky.app/profile/verdverm.com/post/3lycq2yxhzs26

yetitwo

09/08/2025, 3:03 PM

right on

verdverm.com

09/08/2025, 9:00 PM

Is a schema that is 500+ lines pretty typical for real / large systems?

yetitwo

09/08/2025, 9:08 PM

yeah, that's been our experience. we've had to bump the max allowed size of a schema in validation a couple of times to reflect this: https://github.com/authzed/api/pull/77

verdverm.com

09/09/2025, 1:05 AM

Another question:

1. I have a token (subject) with read/write permissions generally
2. I want to limit the permission graph based on a context, read-only is a simple example, one could imagine it could get more complex, effectively another custom role
3. What if I don't know the permission-subgraph ahead of time? (i.e. it's driven by external data / context, user defined custom roles)
4. Part of this is coming from Capability Based Authorization, and delegation of capabilities

How should I model this? If I simplify to custom roles (less-dynamic), maybe I can model context limitations and delegation with a caveat on the roles access? I suppose with two inputs (user says def use this context, env is already restricted, we take the middle of the venn)... perhaps this is where more of the algebra comes in to spicedb?

verdverm.com

09/09/2025, 1:13 AM

I still have to wrap my head around banned users and other identities, how to bring these two together. Is there a way to be more succinct with the crud relations, and then again with the negations they need?

definition superuser {} // PDS admin / moderation
definition anon {}
definition acct {}
definition oauth {}
definition apikey {}
definition svcacct {}
definition service {}  // appview, labeler, feedgen, ... needs a DID

partial negative {
  // various negating relations (should / do we need all of these here)
  relation blocked:   acct | service
  relation muted:     acct | service
  relation banned:    acct | service
  relation takendown: acct | service

  // is there a meta-permission here like this? (permission expressions union with `+`, not `|`)
  permission negated = blocked + muted + banned + takendown
}

definition record {
  // space containment / nesting
  relation parent: space

  ...owned
  ...negative
  ...record_crud
  ...record_iam
}

partial record_crud {
  // Role CRUD relations
  relation record_deleter: superuser |
    acct    | acct    with nsid_allowed |
    oauth   | oauth   with nsid_allowed |
    apikey  | apikey  with nsid_allowed |
    svcacct | svcacct with nsid_allowed |
    service | service with nsid_allowed |
    space#member | space#member with nsid_allowed |
    group#member | group#member with nsid_allowed |
    role#member  | role#member  with nsid_allowed

    ...

  // Role CRUD permissions
  permission record_delete = owner         + record_deleter + parent->record_delete
  permission record_update = record_delete + record_updater + parent->record_update
  permission record_create = record_update + record_creator + parent->record_create
  permission record_list =   record_create + record_lister  + parent->record_list
  permission record_read =   record_list   + record_reader  + parent->record_read
}

verdverm.com

09/09/2025, 1:15 AM

Or is this part of why schemas get so long, and also is the computation really that different, or does the underlying algebra & graph have good algos so it matters less? It's hard to know if the schema I write is "good" both in terms of correctness and performance

yetitwo

09/09/2025, 3:45 PM

i would start with correctness, and then there are some heuristics that you can use to improve performance

yetitwo

09/09/2025, 3:45 PM

ime the simplest schema you can write that expresses the logic you want is usually pretty close to the most performant

yetitwo

09/09/2025, 3:46 PM

there are a couple of guidelines to follow and a couple of non-obvious constructions that can help

yetitwo

09/09/2025, 3:46 PM

one is that negation is expensive, because you need to fully materialize the set on both sides of a negation to determine whether the resulting set is non-empty

yetitwo

09/09/2025, 3:47 PM

whereas unions and intersections can both short-circuit

yetitwo

09/09/2025, 3:51 PM

another is that intersection is more expensive than a union or an arrow (generally), which means that phrasing boolean logic in terms of self-relations can be beneficial:

definition resource {
  relation user: user
  relation active: user:*
  permission view = user & active
}
// becomes
definition resource {
  relation user: user
  relation active: resource
  permission view = active->user
}

and you'd write a relation from a resource to itself to make it active.
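
To make that rewrite concrete, here is a toy Python model showing that the wildcard-intersection form and the self-relation arrow form give the same answers on a small data set. This is not the SpiceDB API: the in-memory tuple store and the helper names `subjects`, `view_intersection`, and `view_arrow` are made up for illustration. It only shows the two schemas express the same check and says nothing about real performance.

```python
# Toy model only: relationships are (resource, relation, subject) tuples.
# Nothing here is SpiceDB's API; all names are hypothetical.

def subjects(rels, resource, relation):
    """Direct subjects stored on resource#relation."""
    return {s for (r, rel, s) in rels if r == resource and rel == relation}

def view_intersection(rels, resource, user):
    # Models: permission view = user & active   (relation active: user:*)
    is_user = user in subjects(rels, resource, "user")
    is_active = "user:*" in subjects(rels, resource, "active")
    return is_user and is_active

def view_arrow(rels, resource, user):
    # Models: permission view = active->user    (relation active: resource)
    # Walk `active` to each target resource, then look up `user` there.
    return any(
        user in subjects(rels, target, "user")
        for target in subjects(rels, resource, "active")
    )

wildcard_rels = {
    ("resource:doc1", "user", "user:alice"),
    ("resource:doc1", "active", "user:*"),
    ("resource:doc2", "user", "user:alice"),  # doc2 is not active
}
self_rels = {
    ("resource:doc1", "user", "user:alice"),
    ("resource:doc1", "active", "resource:doc1"),  # self-relation marks doc1 active
    ("resource:doc2", "user", "user:alice"),       # doc2 has no self-relation
}

for doc in ("resource:doc1", "resource:doc2"):
    print(doc,
          view_intersection(wildcard_rels, doc, "user:alice"),
          view_arrow(self_rels, doc, "user:alice"))
```

Deactivating a resource in the second form means deleting its self-relation, which is the counterpart of deleting the `user:*` wildcard in the first form.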

yetitwo

09/09/2025, 3:52 PM

we also wrote up some best practices a little while ago: https://authzed.com/docs/best-practices

yetitwo

09/09/2025, 3:52 PM

i'd be curious whether this doc is helpful for you or not

yetitwo

09/09/2025, 4:20 PM

i'm also not sure i entirely understand your use case - where would a SpiceDB instance be running? what data would it hold?

verdverm.com

09/09/2025, 5:02 PM

The use-case is private data in ATProtocol, where every user gets their own database, so it seems they should each get their own permission system (so they can migrate both together). The user's database (repo) is managed by a PDS, which can have anywhere from a single user to 500k (though could get to nMillion). So SpiceDB would be running with the PDS for most cases. For a large outfit like Bluesky, they may run SpiceDB for the 30M+ accounts they manage the PDS for (they run a gateway for oauth already, which is run by the PDS when self-hosting as well). This is a really clean explanation of the distributed architecture of ATProtocol: https://atproto.com/articles/atproto-for-distsys-engineers

verdverm.com

09/09/2025, 5:05 PM

More generally, we need to map the permission system onto the existing resources in the protocol, and enable ATProto apps to leverage this permission system instead of having to write their own. So I'm not writing a permission system for one application, but for a network of applications and users. Not sure how much this will help without context or my words, but I have this slide deck I'm working on: https://docs.google.com/presentation/d/1504zw9wtNuG4FvyZSTAsfMbwPXrFuvorfRWOS1mWN44/edit?usp=sharing

verdverm.com

09/09/2025, 6:08 PM

tl;dr, it is very close to Google Docs, each account gets a root space and IAM therein. Organizations will create an account and then assign users within the org's atproto account / spaces

verdverm.com

09/09/2025, 8:47 PM

Just started skimming and it looks like it's going to be really helpful, thanks for the link!

verdverm.com

09/09/2025, 8:48 PM

I was thinking about the negation stuff I have, and your comment that it is far more complex. I believe we can handle this outside the permission system, before we even ask any questions. We are already checking these things in other places anyway, and they are broader strokes in terms of access. Then I see it as one of the first / top recommendations!

verdverm.com

09/09/2025, 8:53 PM

Now I see "Prefer Relationships to Caveats" and I will have to think about what I've done. Fortunately I have set up a testbed for our schema and should be able to try out both methods. Making the NSIDs a type and relation comes with more complexity on our end (there is no actual data or record), they are a scoping/authority mechanism in a content addressable system. Maybe they can be a pseudo-resource like pseudo-relations?

yetitwo

09/09/2025, 9:38 PM

ah yeah, if there's no data or record that sounds more like an attribute-based system which is what caveats are intended to help implement

yetitwo

09/09/2025, 10:19 PM

generating ad-hoc relations sounds painful

verdverm.com

09/09/2025, 10:46 PM

it's more that we have this app-defined NSID (reverse domain namespace id) that sits in the middle of the content addressing than that we have ad-hoc relations. The permissions over them should be relatively simple / limited (at least at the protocol level): CRUD + IAM for content. Custom functions over them live in the apps instead of the PDS, but we can still put permissions on invoking them (really just need one permission available)

verdverm.com

09/09/2025, 10:48 PM

The caveats align well with the oauth scopes permission sets (which are collections of these NSIDs)

//
// Caveats
//

// NSID scoping
caveat nsid_allowed(nsid string, allowed_nsid list<string>) {
  nsid in allowed_nsid
}

// Custom scoping for apps
caveat context_allowed(context string, allowed_context list<string>) {
  context in allowed_context
}

// for special use-cases, not expected to be generally used
caveat time_frame(beg string, end string, mode string) {
  // tbd...
}
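
A rough Python sketch of how a check against a caveat like `nsid_allowed` can be thought of. The allow-list is stored on the relationship, the requested NSID arrives with the check, and the outcome is three-valued, because a check with missing caveat context comes back as conditional rather than as a plain yes or no. The `Result` enum and `check_nsid_allowed` function are hypothetical stand-ins, not SpiceDB client calls. The merge order (relationship context wins over request context) is an assumption to verify against the SpiceDB caveat docs.

```python
from enum import Enum

class Result(Enum):
    HAS_PERMISSION = "has_permission"
    NO_PERMISSION = "no_permission"
    CONDITIONAL = "conditional"  # a caveat parameter was not supplied

def check_nsid_allowed(stored_context, request_context):
    """Toy model of: caveat nsid_allowed(nsid string, allowed_nsid list<string>)
    with body `nsid in allowed_nsid`. Not a real SpiceDB call."""
    # Assumption: values written on the relationship take precedence
    # over values supplied with the check request.
    context = {**request_context, **stored_context}
    if "nsid" not in context or "allowed_nsid" not in context:
        return Result.CONDITIONAL
    if context["nsid"] in context["allowed_nsid"]:
        return Result.HAS_PERMISSION
    return Result.NO_PERMISSION

# Context written alongside the relationship (the allow-list).
stored = {"allowed_nsid": ["bsky_post", "bsky_like"]}

print(check_nsid_allowed(stored, {"nsid": "bsky_post"}))
print(check_nsid_allowed(stored, {"nsid": "bsky_blob"}))
print(check_nsid_allowed(stored, {}))
```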

verdverm.com

09/09/2025, 10:49 PM

If a caveat exists, but is rarely used, then performance experienced should be near-equivalent as if it wasn't in the schema?

yetitwo

09/09/2025, 11:00 PM

yeah, if an evaluation path doesn't include a caveat it shouldn't be affected much (if at all) by the existence of caveats on other paths

verdverm.com

09/09/2025, 11:04 PM

what if it's set in the schema above, but never used in the relations or permission checks? I would imagine the query engine can recognize this and avoid calculations?

yetitwo

09/09/2025, 11:32 PM

i'm not sure i understand why it would be in the schema if it's not used in checks 🤔

verdverm.com

09/10/2025, 12:45 AM

I mean >99% of the checks, imagine a caveat that is rarely used, those 99 would not be impacted by the existence of the caveat?

yetitwo

09/10/2025, 1:52 AM

correct, yeah

yetitwo

09/10/2025, 1:52 AM

up to slightly more stuff in the database and a slightly larger schema to be held in memory and interpreted, both of which i would expect to be negligible

verdverm.com

09/10/2025, 10:08 PM

I'm inevitably creating a short-list of links from spicedb for people in atproto, will share with y'all when they are in a good place

verdverm.com

09/17/2025, 6:58 PM

I might have to go back to the resource version instead of caveats, I need to nest them (or more so records under them, like a thread in a channel) and that makes the dedicated resource with relations seem more appropriate in my mind

verdverm.com

09/20/2025, 10:08 PM

lol, I think I'm back to caveats... having a hard time modeling the nsids as a resource, probably because we don't know what nsid until request time...

yetitwo

09/21/2025, 3:13 PM

yeah that requirement would definitely push me towards caveats

yetitwo

09/21/2025, 3:14 PM

this also sounds like it might be a decent use case for contextual tuples: https://github.com/authzed/spicedb/issues/1398 it's something that we've gone back and forth on actually implementing because it seems difficult to do in a sane and safe way, but for a relation that's not actually known until request time it sounds about right

verdverm.com

09/21/2025, 3:56 PM

Interesting, I'll look into adding the atproto use case there for more context, or maybe I'll just start a new discussion

verdverm.com

09/21/2025, 4:12 PM

I'm not sure we don't know before the request, maybe it's more that we have a large unknown list of NSIDs that are out of our control, but we do imagine users limiting to a sublist (still large, 100+), which can then have several more orders of magnitude of objects on the other side.

<space>
  - 100s <user> in 1+ <group>
  - 100s <nsid>
     - 1000s and beyond <records>

i.e. there is significant fan-out in the content tree, and also the records can refer to each other, which is probably a good relation to capture in Spice. Caveats seem like they will work for this, but I wonder about performance...

caveat nsids(allowed list<string>, nsid string) {
  // this appears to be O(n) instead of O(n_logn)?
  nsid in allowed
}

Though if it was a map... it could be O(1) and support both positive and negative associations with an NSID...? It just seems like with NSIDs in the schema as a resource, to give access, we need to write 100s of relations to the "virtual" NSID (they are not stored, they are part of content addressing), versus one with the caveats on it
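
A back-of-the-envelope sketch of the write-volume difference described above, in Python. The object ID layout (`nsid:<space>/<nsid>`), the `record_viewer` relation on an `nsid` type, and both helper functions are hypothetical and exist only to count writes. They are not taken from the actual schema.

```python
# Hypothetical helpers that only build the data one would write; no SpiceDB calls.

def resource_model_writes(space, subject, nsids):
    # NSID as a resource: one relationship per (subject, nsid) pair.
    return [f"nsid:{space}/{n}#record_viewer@{subject}" for n in nsids]

def caveat_model_writes(space, subject, nsids):
    # NSID as caveat context: one relationship carrying the whole allow-set.
    context = {"allowed": {n: True for n in nsids}}
    return [(f"space:{space}#record_viewer@{subject}", "nsids", context)]

nsid_names = [f"app_{i}" for i in range(150)]  # stand-in for 100+ real NSIDs
per_nsid = resource_model_writes("main", "acct:bryan", nsid_names)
caveated = caveat_model_writes("main", "acct:bryan", nsid_names)

print(len(per_nsid), len(caveated))  # 150 1
```

The trade is fewer, fatter relationships. The caveated form stores the whole allow-set as context on one relationship, so changing a single NSID means rewriting that relationship rather than touching one small one.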

verdverm.com

09/21/2025, 6:49 PM

The latest NSID caveat I'm working with, is the big-O statement accurate?

// NSID filtering (think oauth permission sets and check)
// map  vs list: while specifying is more cumbersome,
// O(1) vs O(N) runtime performance is compelling
caveat nsids(allowed map<bool>, nsid string) {
  allowed[nsid] || false
}

verdverm.com

09/21/2025, 6:55 PM

ugh... why is existence in a map linear while lookup is constant?

> has(e.f): Space is constant.
> If e is a map, time is linear in size of e.

verdverm.com

09/21/2025, 7:00 PM

> In the boolean operators `&&` and `||` if any of their operands uniquely determines the result (false for `&&` and true for `||`) the other operand may or may not be evaluated, and if that evaluation produces a runtime error, it will be ignored.

This is not what I'm seeing, is this a bug?

11:59AM ERR terminated with errors error="rpc error: code = InvalidArgument desc = evaluation error for caveat nsids: no such key: bsky_blob"
ERROR: got  expected false

from: https://github.com/google/cel-spec/blob/master/doc/langdef.md#logical-operators
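
One reading of the quoted spec text that matches the error above: an error is only ignored when the other operand decides the result on its own, and for `||` only `true` does that. In `allowed[nsid] || false`, the right-hand `false` decides nothing, so the missing-key error surfaces. Below is a small Python model of that reading. `CelError`, `cel_or`, and `index` are illustrative names, not part of cel-go or SpiceDB.

```python
class CelError(Exception):
    """Stand-in for a CEL runtime error such as 'no such key'."""

def index(mapping, key):
    # Models CEL map indexing: a missing key is a runtime error, not false.
    if key not in mapping:
        raise CelError(f"no such key: {key}")
    return mapping[key]

def cel_or(left, right):
    """Model of `left || right`; operands are zero-argument callables."""
    outcomes = []
    for operand in (left, right):
        try:
            value = operand()
        except CelError as err:
            outcomes.append(err)
            continue
        if value is True:
            return True  # true decides the result; an error on the other side is ignored
        outcomes.append(value)
    for outcome in outcomes:
        if isinstance(outcome, CelError):
            raise outcome  # nothing was true, so the error is the result
    return False

allowed = {"bsky_post": True}

print(cel_or(lambda: index(allowed, "bsky_post"), lambda: False))  # key present
print(cel_or(lambda: index(allowed, "bsky_blob"), lambda: True))   # error absorbed by true
try:
    cel_or(lambda: index(allowed, "bsky_blob"), lambda: False)     # error surfaces
except CelError as err:
    print("error:", err)
```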

verdverm.com

09/21/2025, 11:21 PM

new latest version

caveat nsids(allowed map<bool>, default bool, nsid string) {
    (nsid in allowed) ? allowed[nsid] : (default || false)
  }
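
For reference, the decision table this version encodes, as a plain Python function. This is a sketch of the intended logic only, assuming all three parameters are supplied. What SpiceDB does when `default` is left out of the context is not modeled here.

```python
def nsids(allowed, default, nsid):
    """Python rendering of: (nsid in allowed) ? allowed[nsid] : (default || false)
    Assumes `allowed` maps NSID strings to bools and `default` is a bool."""
    if nsid in allowed:          # guard first, so indexing can never hit a missing key
        return allowed[nsid]     # explicit allow (true) or explicit block (false)
    return default or False      # anything unlisted falls back to the default

allow_list = {"bsky_post": True, "bsky_like": True}
print(nsids(allow_list, False, "bsky_post"))  # listed, allowed
print(nsids(allow_list, False, "bsky_blob"))  # unlisted, default deny

block_list = {"bsky_dm": False}
print(nsids(block_list, True, "bsky_dm"))     # listed, blocked
print(nsids(block_list, True, "bsky_post"))   # unlisted, default allow
```

With `default` false the map behaves as an allow list, and with `default` true it behaves as a block list, so one caveat covers both modes.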

yetitwo

09/29/2025, 2:55 PM

that's bizarre

yetitwo

09/29/2025, 2:56 PM

it might be? that's definitely surprising to me. what was the expression and what was the context provided?

verdverm.com

09/29/2025, 3:02 PM

They would be very simple, like

["  caveat bryan", [_space, "record_viewer", _bryan,  #"nsids:{"default": false, "allowed": {"bsky_post":true,"bsky_like":true}}"#]],
      ["  caveat devin", [_space, "record_viewer", _devin,  #"nsids:{"allowed": {"bsky_post":true}}"#]],

verdverm.com

09/29/2025, 3:11 PM

It's unclear if the docs are inaccurate, Gemini DR was insistent that it is O(1) because it uses that underlying Go map, which seems to be correct if the right kind of map is created (?) https://github.com/google/cel-go/blob/master/common/types/map.go#L587

verdverm.com

09/29/2025, 3:16 PM

fwiw, this is what Gemini said about the non-short-circuit to falsy for errors in CEL ... also that doing so could lead to unintended access in a permissions system, which I do think is a valid design point the CEL authors probably had in mind (?) https://cdn.discordapp.com/attachments/1414026804867371079/1422240490215964783/Screenshot_2025-09-29_at_8.13.39_AM.png?ex=68dbf42c&is=68daa2ac&hm=054bcf4ea086229b8b2f9ca013afc029cb0d9334a89357811f500d5b03820bcb&

yetitwo

09/29/2025, 3:17 PM

ah hm. that makes some sense, though i have thoughts about map access returning errors.

yetitwo

09/29/2025, 3:17 PM

"doing so" meaning short-circuiting?

verdverm.com

09/29/2025, 3:37 PM

yea, if you short circuit to false on error, i.e. if this was a block list rather than an allow list

yetitwo

09/29/2025, 3:37 PM

hmmm yeah

verdverm.com

09/29/2025, 3:38 PM

I'm actually happier with the final caveat, it supports more use cases without being overly complex (defaulting to allow or block depending on the app or user)

yetitwo

09/29/2025, 3:39 PM

🙌

Joey

09/29/2025, 3:41 PM

@verdverm.com I explicitly had errors treated as such to prevent "hidden" issues
