LookupResources with cursors returns duplicates
p
Greetings, it's my first time posting here. I've been reviewing AuthZed's SpiceDB documentation and running some tests to see if we can use SpiceDB for our product's auth needs, and I'm very impressed by everything: the performance, the documentation, the code. I've been digging into the code a little bit... so thank YOU! I did find something interesting, though, and I'm not sure whether it's expected behavior or an actual issue. Here's my playground: https://play.authzed.com/s/oZ-wK3bcvqCY/schema.

The issue is with the `LookupResources` gRPC procedure. If I don't use any limit, I get all the objects, no problem. When I use a limit that is less than the total number of expected response items, everything works fine as well. In the call below, I expect 4 items, and with a limit of 1, 2, or 3 everything works as expected, but with a limit of 800 I get 6 items, two of which are duplicates. I used version `1.25.0-rc4`. Here's the request I used:

```json
{
  "consistency": {
    "minimize_latency": true
  },
  "context": {
    "fields": [
      {
        "key": "account_id",
        "value": {
          "string_value": "account1"
        }
      }
    ]
  },
  "optional_limit": 800,
  "permission": "view",
  "resource_object_type": "filestore/file",
  "subject": {
    "object": {
      "object_id": "multi-account-user",
      "object_type": "iam/user"
    }
  }
}
```
v
👋 Duplicates are possible with `LookupResources`, as it has to explore all branches, and the same tuple may be reached through different paths. When you don't set a limit, the original behaviour of the API is retained, i.e. the behaviour before cursors were introduced: it buffers everything in memory, dedupes it, and returns it. While that is likely what you are looking for, it won't scale as the number of returned elements grows, to the point where it will OOMKill your instance. For that reason cursors were introduced, and unfortunately the implementation has no way of knowing which elements have already been seen in previous pages, so duplicates are possible. Please also note that results may also arrive out of order.
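If your client needs an exact result set, the duplicates are cheap to filter out on the receiving side. Below is a minimal sketch of that idea, assuming the authzed-go v1 client; the endpoint and preshared key are placeholders, and the request simply mirrors the one from the original post:

```go
package main

import (
	"context"
	"fmt"
	"io"
	"log"

	v1 "github.com/authzed/authzed-go/proto/authzed/api/v1"
	"github.com/authzed/authzed-go/v1"
	"github.com/authzed/grpcutil"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// Placeholder endpoint and preshared key; adjust for your deployment.
	client, err := authzed.NewClient(
		"localhost:50051",
		grpcutil.WithInsecureBearerToken("somerandomkeyhere"),
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatalf("unable to create client: %v", err)
	}

	// Caveat context from the request in the thread.
	caveatCtx, err := structpb.NewStruct(map[string]any{"account_id": "account1"})
	if err != nil {
		log.Fatalf("unable to build caveat context: %v", err)
	}

	stream, err := client.LookupResources(context.Background(), &v1.LookupResourcesRequest{
		Consistency: &v1.Consistency{
			Requirement: &v1.Consistency_MinimizeLatency{MinimizeLatency: true},
		},
		ResourceObjectType: "filestore/file",
		Permission:         "view",
		Subject: &v1.SubjectReference{
			Object: &v1.ObjectReference{ObjectType: "iam/user", ObjectId: "multi-account-user"},
		},
		Context:       caveatCtx,
		OptionalLimit: 800,
	})
	if err != nil {
		log.Fatalf("lookup failed: %v", err)
	}

	seen := map[string]struct{}{} // resource IDs already handed to the caller
	for {
		resp, err := stream.Recv()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatalf("stream error: %v", err)
		}
		if _, dup := seen[resp.ResourceObjectId]; dup {
			continue // same resource reached via a different path; skip it
		}
		seen[resp.ResourceObjectId] = struct{}{}
		fmt.Println(resp.ResourceObjectId)
	}
}
```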
p
I see. Thanks for the detailed explanation. I'm guessing that's also why there's a limit of 1,000 on `optional_limit`: to avoid having to use too much memory.
a
Good to know, thank you.
v
Correct, to set an upper bound on the memory required for the API call. You should adjust it depending on the resources of your deployed cluster.
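To make that concrete, here is a sketch of one way to combine a modest per-call limit with cursor paging, again assuming the authzed-go v1 client; `lookupAllFiles` and its `pageSize` parameter are made-up names for illustration, and the caveat context from the original request is omitted for brevity:

```go
package spicedbclient

import (
	"context"
	"io"

	v1 "github.com/authzed/authzed-go/proto/authzed/api/v1"
	"github.com/authzed/authzed-go/v1"
)

// lookupAllFiles pages through LookupResources with a bounded per-call limit,
// resuming from the cursor of the last received result and skipping IDs that
// appear on more than one page. pageSize trades per-call memory for round trips.
func lookupAllFiles(ctx context.Context, client *authzed.Client, pageSize uint32) ([]string, error) {
	seen := map[string]struct{}{}
	var results []string
	var cursor *v1.Cursor

	for {
		stream, err := client.LookupResources(ctx, &v1.LookupResourcesRequest{
			Consistency: &v1.Consistency{
				Requirement: &v1.Consistency_MinimizeLatency{MinimizeLatency: true},
			},
			ResourceObjectType: "filestore/file",
			Permission:         "view",
			Subject: &v1.SubjectReference{
				Object: &v1.ObjectReference{ObjectType: "iam/user", ObjectId: "multi-account-user"},
			},
			OptionalLimit:  pageSize,
			OptionalCursor: cursor,
		})
		if err != nil {
			return nil, err
		}

		received := uint32(0)
		for {
			resp, err := stream.Recv()
			if err == io.EOF {
				break
			}
			if err != nil {
				return nil, err
			}
			received++
			cursor = resp.AfterResultCursor // resume point for the next page
			if _, dup := seen[resp.ResourceObjectId]; !dup {
				seen[resp.ResourceObjectId] = struct{}{}
				results = append(results, resp.ResourceObjectId)
			}
		}

		// A short page means there is nothing left to fetch.
		if received < pageSize {
			return results, nil
		}
	}
}
```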