# spicedb
f
Hey there, I am currently trying to implement a search index for auth checks on listing queries. This is currently done by polling with `LookupResources` on a self-hosted instance in k8s using the spicedb-operator. The number of tuples is not high (< 200k), and the expected result set for a single request is also relatively small (< 50k tuples). I have already given the underlying database (Postgres) a lot of resources, and the SpiceDB pods as well. I am experiencing either:
- really long response times (~30s-1min), or
- errors: `4 DEADLINE_EXCEEDED`

Is there any way to configure SpiceDB to increase the deadline? List-query permission checking is an absolute must for us, so without it, and without Materialize being public (yet), I am not sure about other options :/

PS: As the underlying database is GCP Cloud SQL, I have some query insights into what's happening. Although I already gave Postgres 4 CPUs and 16 GB of RAM, it spikes to 100% CPU usage when requesting that data. It seems a bit fishy to me that this already leads to such high usage.
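For context on the deadline: gRPC deadlines are per-call values set by the client and propagated to the server, so they are usually raised on the caller's side rather than in SpiceDB's configuration. A minimal sketch, assuming the `@authzed/authzed-node` client, whose generated grpc-js methods accept standard `Metadata`/`CallOptions`; the token, endpoint, and `document`/`view`/`user` names are placeholders, not from this thread:

```typescript
// Sketch: raising the client-side gRPC deadline for a LookupResources call.
// Assumes @authzed/authzed-node; all identifiers below are illustrative.
import { v1 } from '@authzed/authzed-node';
import { Metadata } from '@grpc/grpc-js';

const client = v1.NewClient(
  'my-preshared-key',                     // placeholder token
  'spicedb.default.svc:50051',            // placeholder in-cluster endpoint
  v1.ClientSecurity.INSECURE_PLAINTEXT_CREDENTIALS,
);

const request = v1.LookupResourcesRequest.create({
  resourceObjectType: 'document',         // placeholder resource type
  permission: 'view',                     // placeholder permission
  subject: v1.SubjectReference.create({
    object: v1.ObjectReference.create({ objectType: 'user', objectId: 'alice' }),
  }),
});

// The deadline is a standard grpc-js CallOption; gRPC propagates it to the
// server, so extending it lifts the DEADLINE_EXCEEDED ceiling for this call.
const stream = client.lookupResources(request, new Metadata(), {
  deadline: Date.now() + 5 * 60 * 1000,   // allow up to 5 minutes
});

stream.on('data', (resp: v1.LookupResourcesResponse) => {
  console.log('accessible:', resp.resourceObjectId);
});
stream.on('error', (err: Error) => console.error('LookupResources failed:', err));
stream.on('end', () => console.log('stream complete'));
```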
v
Are you using `optional_limit` in your `LookupResources` queries?
The Zanzibar paper basically describes using the `Expand` API for what you are trying to achieve. @ecordell has put some thought into it.
f
Yes, I tried three things (Node.js user here):
- the promise-based API
- the regular "streaming" API
- limits with pagination (promise-based; see the sketch below)

Limits worked, but were even slower (> 2-3 min total response time).
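For reference, a sketch of the limits-with-pagination variant: bound each request with `optionalLimit` and resume from the cursor attached to each result. This assumes the promise-style client (`client.promises`) from `@authzed/authzed-node`, which collects each page into an array, and reuses the placeholder names from the sketch above:

```typescript
// Sketch: cursor pagination over LookupResources with a bounded page size.
// Assumes @authzed/authzed-node's promise client; names are placeholders.
import { v1 } from '@authzed/authzed-node';

const PAGE_SIZE = 1000;

async function lookupAllResources(
  client: ReturnType<typeof v1.NewClient>,
): Promise<string[]> {
  const ids: string[] = [];
  let cursor: v1.Cursor | undefined;

  for (;;) {
    // Each page is bounded, so the server never buffers the full result set.
    const page = await client.promises.lookupResources(
      v1.LookupResourcesRequest.create({
        resourceObjectType: 'document',    // placeholder resource type
        permission: 'view',                // placeholder permission
        subject: v1.SubjectReference.create({
          object: v1.ObjectReference.create({ objectType: 'user', objectId: 'alice' }),
        }),
        optionalLimit: PAGE_SIZE,
        optionalCursor: cursor,
      }),
    );

    for (const resp of page) {
      ids.push(resp.resourceObjectId);
      cursor = resp.afterResultCursor;     // resume token for the next page
    }
    if (page.length < PAGE_SIZE) break;    // short (or empty) page: done
  }
  return ids;
}
```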
v
Well, you have to use limits. The lack of limits is a backwards-compatibility guarantee we left in place because it was the first design of the API, but it can cause the server to get OOMKilled, since it has to buffer all elements in memory before streaming them.
How many elements are your LR requests returning?
f
Just did some spot-checks: roughly 20-30k.
v
Yeah, so that's going to take a bit to compute, but 2-3 min seems like a lot; it may be missing indexes. Have you looked into the database profiler?
A user identified https://github.com/authzed/spicedb/issues/1687, and I wonder if it could be affecting other DBs too
f
Yep, so it computes individual requests fast-ish. The most load-causing query is something like this:
```sql
SELECT
  namespace,
  object_id,
  relation,
  userset_namespace,
  userset_object_id,
  userset_relation,
  caveat_name,
  caveat_context
FROM
  relation_tuple
WHERE
  pg_visible_in_snapshot(created_xid,
    $1) = $2
  AND pg_visible_in_snapshot(deleted_xid,
    $3) = $4
  AND namespace = $5
  AND relation = $6
  AND object_id IN ($7, <....>) LIMIT $107
```
By itself it executes fast (~6 ms), but it is called a ridiculous number of times (roughly 600k).
v
Right. It really depends on your schema; chances are there is an opportunity to optimize how your schema is traversed.
To me it feels like something in your schema makes every tuple reachable.
f
Sounds reasonable! The schema is quite complex, as it does something similar to the Google Cloud permissions example in the playground.
v
Have you run `zed permission check --explain` on the same LR path you are using? What's the SpiceDB version?
You could use `zed backup create` and `zed backup redact` and send us a dump of your schema, but to be totally transparent, we are swamped with work, so it would be best effort.
f
> Have you run `zed permission check --explain` on the same LR path you are using?

Didn't do that, can try now.

> What's the SpiceDB version?

Latest, 1.30.1.
> but to be totally transparent, we are swamped with work, so it would be best effort

No worries, I have a call scheduled for next week to discuss further options. We are looking into hosting as well; I don't really want to manage this myself 🙂
Regarding the permission check explain: should it output something? For a sample permission I am getting:
```
9:55AM INF debugging requested on check
true
9:55AM WRN No debugging information returned for the check
```
v
Could you upgrade your zed version?