Question Idea about data enrichment SpiceDB #spicedb


dguhr84

06/02/2023, 7:03 AM

Question / idea about data "enrichment" capabilities / good practice: Is there a recommended way, or anything planned, to add some kind of "non-authz-state-related metadata"? Context: We're saving the OAuth/OIDC sub claim as the unique user ID in SpiceDB relations when seeding new organizations. Ideally we also want to use SpiceDB as the source of truth for lookups such as "give me all the users that can be assigned here" or "give me the already assigned users" in a licensing/entitlement context. Basically wrapped Lookup calls. Now it would be lovely to have the additional fields for the frontend, such as name, email, ..., also returned from these calls without the added complexity of asking another queue/system. Any thoughts would be helpful 🙂 Idea:

zed permission lookup-subjects org:myOrg licensed user

currently returns a list of the previously seeded sub UUIDs. What I have in mind is something along the lines of

zed permission lookup-subjects org:myOrg licensed user --enrich={table="userdata", fields=surname,lastname} --enrich-onerror=<ignore/fail/...

or something similar, which would then also return the defined fields from the table. This is not at all well thought out and assumes a lot, so I'm happy to hear how others handle this kind of data enrichment.

vroldanbet

06/02/2023, 8:35 AM

Hey @dguhr84 👋 We've discussed a "Metadata API" internally but haven't actually created a public proposal for it. The idea would be to store this metadata alongside the relationship (similar to how we do caveat context) and to retrieve it using the ReadRelationships API. We've seen interest in something like this over time, but it hasn't been on our priority list. With such a design in mind, Lookups would have some complications: there are multiple paths by which a subject can be found to have access to a resource, and if the metadata is part of the relationship, you may need to duplicate that metadata everywhere that subject has a direct relationship. If the application is aware of where those relationships live, would a Lookup + subsequent ReadRelationships work for y'all?

Joey

06/02/2023, 4:49 PM

metadata issue for reference: https://github.com/authzed/spicedb/issues/966

dguhr84

06/05/2023, 7:06 AM

Thanks for the answer & link 🙂 I need some time to think about it and read through. Currently I don't think(!) it'd solve our problem completely, as we want to be able to codify something like "get me 30 assignable users whose surnames start with 'R', starting from 200 users down", where "assignable" is a permission of active_user - assigned of an org, in the context of licenses. That call's results should be enriched with other data such as name, email etc. So the "best" solution would be to query permissions and sort/filter on metadata in the same call, I guess. We are currently thinking of another call for batch-enriching the data, or of using a materialised view/read model for fetching that additional data (eventually consistent). It all feels a bit suboptimal/weird without having the concept of a real "entity". Then again, this is not the critical "check" path with its performance considerations. Only loose thoughts so far, for sure 🙂
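
The "another call for batch-enriching" option could look roughly like the sketch below. Everything in it is an assumption for illustration: lookup_subjects is a placeholder for a wrapped SpiceDB lookup call, USERDATA stands in for whatever store holds the profile fields, and the on_error values mirror the hypothetical --enrich-onerror flag from the first message rather than any real zed or SpiceDB option.

```python
from typing import Iterable

# Hypothetical profile store keyed by the OIDC `sub` claim (not part of SpiceDB).
USERDATA = {
    "sub-1": {"name": "Rita Rossi", "email": "rita@example.com"},
    "sub-2": {"name": "Sam Smith", "email": "sam@example.com"},
}


def lookup_subjects(resource: str, permission: str) -> list[str]:
    """Placeholder for a wrapped SpiceDB lookup; returns subject IDs only."""
    return ["sub-1", "sub-2", "sub-404"]


def enrich(subject_ids: Iterable[str], fields: list[str], on_error: str = "ignore") -> list[dict]:
    """Attach the requested profile fields to each subject ID in one batch.

    on_error="ignore" skips IDs with no profile row; on_error="fail" raises.
    """
    enriched = []
    for sid in subject_ids:
        row = USERDATA.get(sid)
        if row is None:
            if on_error == "fail":
                raise KeyError(f"no userdata for subject {sid}")
            continue  # "ignore": drop subjects without metadata
        enriched.append({"id": sid, **{f: row.get(f) for f in fields}})
    return enriched


if __name__ == "__main__":
    ids = lookup_subjects("org:myOrg", "licensed")
    for user in enrich(ids, ["name", "email"]):
        print(user)
```

The authorization answer still comes only from the lookup; the enrichment step adds display fields and never decides who is in the list.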

vroldanbet

06/05/2023, 8:48 AM

This feels like a task for the LookupWatch proposal. You'd ingest the changes to LookupResources over a specific permission into your own database, and then sort and filter as you see fit. The new LookupResources implementation in the upcoming 1.22 release is fully streaming, so if you accepted the order established by the system, you could filter as resources come in. You'd still need the associated metadata to filter on, though, or you'd have to load it from your database as the relationships are streamed. "From 200 users down" sounds like offset-based pagination, and as a consequence, if you want to say "starting from 1M users down", the system would have to load the first 999,999 users from the database. I have concerns about opening up arbitrary querying semantics into the underlying database: the indices are already fine-tuned for the limited access patterns the API offers. And it can quickly spiral into exposing full-blown SQL, as every customer will have different requirements.
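
A rough sketch of the "ingest into your own database, then sort and filter" approach, using an in-memory SQLite read model. LookupWatch is only a proposal, so the change-event shape used here ({"op", "org", "user"}) is invented for illustration, as are the table and function names.

```python
import sqlite3


def open_read_model() -> sqlite3.Connection:
    """Create an in-memory read model: permission results plus user metadata."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE assignable (org_id TEXT, user_id TEXT, PRIMARY KEY (org_id, user_id));
        CREATE TABLE userdata (user_id TEXT PRIMARY KEY, surname TEXT, email TEXT);
        """
    )
    return db


def apply_change(db: sqlite3.Connection, change: dict) -> None:
    """Apply one hypothetical change event to the read model."""
    key = (change["org"], change["user"])
    if change["op"] == "grant":
        db.execute("INSERT OR IGNORE INTO assignable VALUES (?, ?)", key)
    elif change["op"] == "revoke":
        db.execute("DELETE FROM assignable WHERE org_id = ? AND user_id = ?", key)
    else:
        raise ValueError(f"unknown op: {change['op']}")


def assignable_by_prefix(db: sqlite3.Connection, org: str, prefix: str, limit: int = 30) -> list[tuple]:
    """Return assignable users whose surname starts with `prefix`, sorted by surname.

    Note: `prefix` is passed to LIKE as-is, so % and _ in it act as wildcards.
    """
    rows = db.execute(
        "SELECT u.user_id, u.surname, u.email "
        "FROM assignable a JOIN userdata u ON u.user_id = a.user_id "
        "WHERE a.org_id = ? AND u.surname LIKE ? "
        "ORDER BY u.surname, u.user_id LIMIT ?",
        (org, prefix + "%", limit),
    )
    return rows.fetchall()
```

The read model is only as fresh as the last ingested change, so results are eventually consistent with SpiceDB; anything security-critical would still go through a regular permission check.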

dguhr84

06/05/2023, 11:53 AM

About the "starting from the 200th user": it shouldn't mean offset-based stuff. We'd surely want to leverage pointer/cursor-based pagination 🙂 I totally understand the concerns. Perhaps it's just something that should be enriched not by SpiceDB's persistence, but by another one. Somehow it would be nice, though, to get all of it with one call 🙂
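
To illustrate the difference between the two pagination styles discussed above, here is a minimal keyset (cursor) pagination sketch over an already sorted list. It does not model SpiceDB's own cursors; the Row shape and the page_after helper are made up for this example.

```python
from bisect import bisect_right
from typing import Optional

Row = tuple[str, str]  # (surname, user_id); the pair is the sort key


def page_after(rows: list[Row], cursor: Optional[Row], limit: int) -> tuple[list[Row], Optional[Row]]:
    """Return up to `limit` rows strictly after `cursor`, plus the next cursor.

    `rows` must be sorted. The cursor is the last row of the previous page, so
    finding the start is a binary search instead of counting skipped rows.
    """
    start = 0 if cursor is None else bisect_right(rows, cursor)
    chunk = rows[start:start + limit]
    # Only hand out a cursor when more rows remain after this page.
    next_cursor = chunk[-1] if start + limit < len(rows) else None
    return chunk, next_cursor


if __name__ == "__main__":
    users = sorted([("Reed", "u3"), ("Rossi", "u1"), ("Ruiz", "u4"), ("Smith", "u2")])
    cursor = None
    while True:
        page, cursor = page_after(users, cursor, 2)
        print(page)
        if cursor is None:
            break
```

With an offset, reaching a deep page means walking past every earlier row; with a cursor, the cost depends on the page size rather than on how deep into the result set the page is.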

vroldanbet

06/05/2023, 6:10 PM

Yeah, I get the convenience, for sure! It's a tricky balance, though.
