# spicedb
t
Hi! I have a question about the performance of LookupResources. We're currently designing a new system with different microservices and had the idea to model and store the relations between objects only in SpiceDB. That way none of the services would have to have any knowledge of the relations or permission logic. That would mean that if someone opens a web interface and we want to show them every resource of a type (e.g. "document") they have access to and can edit, we would have to query all of those objects from SpiceDB before showing them. Would that be possible, or would LookupResources be a performance bottleneck if we have thousands (or millions) of users? (I have to admit that I don't have any experience with gRPC streaming.)
v
LookupResources could be a bottleneck, depending on how it's used. It supports cursoring to help retrieve very large numbers of resources without buffering them in memory first, but please do note that it does not guarantee ordering, and there may be duplicates found in various branches. If used without cursoring, it may exhaust server resources, as it has to compute the whole result before it starts streaming it.
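A minimal sketch of consuming such a stream lazily in Python. `FakeResult` here is a hypothetical stand-in for the streamed LookupResources response messages (each of which carries a `resource_object_id`), not the real client API; the point is that the results can be processed one at a time, and that duplicates from different branches may need deduplicating on the client side:

```python
from typing import Iterable, Iterator


class FakeResult:
    """Hypothetical stand-in for one streamed LookupResources response."""

    def __init__(self, resource_object_id: str):
        self.resource_object_id = resource_object_id


def stream_ids(responses: Iterable[FakeResult]) -> Iterator[str]:
    """Consume the stream lazily instead of buffering the whole result set."""
    seen = set()
    for resp in responses:
        # LookupResources may emit the same resource from different
        # branches of the permission graph, so deduplicate as we go.
        if resp.resource_object_id not in seen:
            seen.add(resp.resource_object_id)
            yield resp.resource_object_id


# Usage: a fake stream of three results containing one duplicate.
ids = list(stream_ids(FakeResult(i) for i in ["doc1", "doc2", "doc1"]))
```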
t
Thank you for your answer! To use cursoring you would have to define the "optional_limit" and then continue with the "after_result_cursor" from the result until you have the complete list of resources? If a user had, let's say, 100 or 1,000 documents, would using cursoring to retrieve their IDs via LookupResources make sense? The set itself should not need a huge amount of memory, because it's just some IDs with some additional info?
I did some testing with this schema and SpiceDB running in Docker: https://play.authzed.com/s/NVTEhOOCjAPS/schema I assigned 1,000 "knowledgebases" to "org1" via the relation "kb_org", and assigned "user1" to "org1" via the relation "member". LookupResources for type "knowledgebase", permission "view", and subject "user1" took ~50 ms. Then I created 99,000 "knowledgebases" assigned to "org2" and ran LookupResources for "user1" (who is not assigned to "org2") again; it took ~70 ms. I did the same test with 999,000 "knowledgebases" assigned to "org2", which increased the duration of the query (which still returns the same 1,000 "knowledgebases") to ~350 ms. Is this the expected behavior?
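For reference, the linked playground schema presumably models something along these lines. This is reconstructed from the relation and permission names mentioned in the message, so treat it as a sketch rather than the exact schema:

```
definition user {}

definition org {
    relation member: user
}

definition knowledgebase {
    relation kb_org: org
    permission view = kb_org->member
}
```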
v
> To use cursoring you would have to define the "optional_limit" and then you could continue with the "after_result_cursor" from the result until you have the complete list of resources?

Correct.

> If a user had let's say 100 or 1000 documents, would using cursoring for retrieving their IDs via LookupResources make sense? The set itself should not need a huge amount of memory, because it's just some IDs with some additional infos?

1K elements should be fine without cursors, as long as you are able to guarantee that size from the application side. The moment that size grows, it would be eagerly loading a bunch of stuff in memory. And unfortunately there is a bunch of tracking happening at runtime, so it requires some memory.
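The cursoring loop described above can be sketched like this. `lookup_page` is a hypothetical stand-in for a LookupResources call made with `optional_limit` set and the previous page's `after_result_cursor` passed back in; the real cursor is an opaque token, modeled here as a plain index for the sake of a self-contained example:

```python
from typing import List, Optional, Tuple

# Fake backend data: the full set of resource IDs the user can see.
ALL_IDS = [f"kb{i}" for i in range(10)]


def lookup_page(limit: int, cursor: Optional[int]) -> Tuple[List[str], Optional[int]]:
    """Stand-in for LookupResources with optional_limit set.

    Returns (page_of_ids, after_result_cursor); the cursor is None
    once there are no further pages.
    """
    start = cursor or 0
    page = ALL_IDS[start:start + limit]
    next_cursor = start + limit if start + limit < len(ALL_IDS) else None
    return page, next_cursor


def lookup_all(limit: int) -> List[str]:
    """Keep requesting pages, feeding each after_result_cursor back in."""
    results: List[str] = []
    cursor: Optional[int] = None
    while True:
        page, cursor = lookup_page(limit, cursor)
        results.extend(page)
        if cursor is None:
            return results


collected = lookup_all(limit=3)
```

With a limit of 3 this makes four calls (pages of 3, 3, 3, and 1) before the cursor runs out.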
> Is this the expected behavior?

Are you using the memory datastore?
t
I have to admit I'm not sure what I'm using. I'm using the Docker image pulled with `docker pull authzed/spicedb` and started with:

```shell
docker run \
    --name spicedb-testing \
    -p 50051:50051 \
    authzed/spicedb \
    serve-testing
```
I did not change any settings. The memory store seems likely, because after stopping the Docker container the data is gone. Would using Postgres/CockroachDB improve the performance when a lot of relations are defined?
v
If you are using `serve-testing`, then you are using the memory datastore, which is not optimized for production workloads. This isn't to say there isn't an issue here, but the memory datastore wouldn't be the reference implementation for performance on large datasets. Postgres is likely a good place to start.
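A minimal way to try the Postgres datastore, mirroring the `docker run` invocation above. The preshared key and connection URI below are placeholders, and the database needs its schema migrated first (`spicedb migrate head` with the same datastore flags) before `serve` will start:

```shell
docker run \
    --name spicedb-postgres-test \
    -p 50051:50051 \
    authzed/spicedb \
    serve \
    --grpc-preshared-key "sometoken" \
    --datastore-engine postgres \
    --datastore-conn-uri "postgres://user:password@host:5432/spicedb?sslmode=disable"
```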
t
I'll take a look at the performance with Postgres 👍
Performance with Postgres is a lot better.

> I did the same test with 999,000 "knowledgebases" assigned to "org2" which increased the duration of the query (which is still returning the same 1,000 "knowledgebases") to ~350 ms.

With Postgres this is now down to ~50-70 ms. 👍
v
👍