Another question: do I still need to set SpiceDB #spicedb

Another question: do I still need to set

williamdclt

04/30/2024, 12:28 PM

Another question: do I still need to set

plan_cache_mode = force_custom_plan

myself in postgres? How does SpiceDB handle that now?

vroldanbet

04/30/2024, 12:40 PM

You don't have to set it anymore, it's set by default, unless you specify

plan_cache_mode

in the DSN https://github.com/authzed/spicedb/blob/main/internal/datastore/postgres/postgres.go#L615-L634

williamdclt

04/30/2024, 12:58 PM

I still find it sad that the generic plans are disabled for all queries, when only a single query had an issue. For example, the following query is by far the most commonly queries (~2500 queries/s in my system), and from a quick test it looks like a generic plan speeds it up by >3x (most of the time is spent in ). That really adds up, and I have some hope that it might free up a bunch of CPU which is my bottleneck ATM

Copy code

sql
SELECT namespace, object_id, relation, userset_namespace, userset_object_id, userset_relation, caveat_name, caveat_context FROM relation_tuple WHERE pg_visible_in_snapshot(created_xid, $1) = $2 AND pg_visible_in_snapshot(deleted_xid, $3) = $4 AND namespace = $5 AND relation = $6 AND object_id IN ($7) AND ((userset_namespace = $8 AND userset_object_id IN ($9) AND userset_relation = $10)) LIMIT 9223372036854775807

See the "planning time" in these 2 EXPLAIN ANALYZE:

williamdclt

04/30/2024, 12:58 PM

With `plan_cache_mode = force_custom_plan`:

Copy code

sql
 Limit  (cost=0.69..8.72 rows=1 width=138) (actual time=0.033..0.034 rows=0 loops=1)
   ->  Index Scan using uq_relation_tuple_living_xid on relation_tuple  (cost=0.69..8.72 rows=1 width=138) (actual time=0.033..0.033 rows=0 loops=1)
         Index Cond: (((namespace)::text = 'care_recipient'::text) AND ((object_id)::text = '(268ea094-92e4-11eb-a25b-067ff236ecb9,af6f7d21-eb5d-42a0-8b9e-cc55dddfeee0,99c8c11d-f6d2-471d-845b-ed2dbc21e822,90ae53f5-ad48-4929-9a12-b4a6ed8a3d80,9f7a7d6d-9b6a-4b2e-8d33-dfbea8e4c59f)'::text) AND ((relation)::text = 'caregiver'::text) AND ((userset_namespace)::text = 'user_agency_role'::text) AND ((userset_object_id)::text = '(e3108ff2-4041-4287-972e-9fb5ca6d8cc7__cd79fa38-b6e8-4452-a14b-7f67141d0489,e3108ff2-4041-4287-972e-9fb5ca6d8cc7__cd79fa38-b6e8-4452-a14b-7f67141d0488,e3108ff2-4041-4287-972e-9fb5ca6d8cc7__cd79fa38-b6e8-4452-a14b-7f67141d0481)'::text) AND ((userset_relation)::text = '...'::text))
         Filter: (pg_visible_in_snapshot(created_xid, '795:799:795,797'::pg_snapshot) AND (NOT pg_visible_in_snapshot(deleted_xid, '795:799:795,797'::pg_snapshot)))
 Planning Time: 0.204 ms
 Execution Time: 0.057 ms
(6 rows)

With `plan_cache_mode = auto`:

Copy code

sql
 Limit  (cost=0.56..8.60 rows=1 width=138) (actual time=0.040..0.040 rows=0 loops=1)
   ->  Index Scan using ix_relation_tuple_by_subject on relation_tuple  (cost=0.56..8.60 rows=1 width=138) (actual time=0.039..0.039 rows=0 loops=1)
         Index Cond: (((userset_object_id)::text = $9) AND ((userset_namespace)::text = $8) AND ((userset_relation)::text = $10) AND ((namespace)::text = $5) AND ((relation)::text = $6))
         Filter: (((object_id)::text = $7) AND (pg_visible_in_snapshot(created_xid, $1) = $2) AND (pg_visible_in_snapshot(deleted_xid, $3) = $4))
 Planning Time: 0.055 ms
 Execution Time: 0.079 ms
(6 rows)

williamdclt

04/30/2024, 12:58 PM

interestingly they don't use the same index actually 🤔

vroldanbet

04/30/2024, 1:31 PM

It's tricky to get all of the access patterns fully optimized, given all the variants SpiceDB exercises. I'm sure there are opportunities to make things better, but what works for your database may not work in another database instance, depending on the dataset.

williamdclt

04/30/2024, 1:34 PM

I'd say that PG is likely to know better how to optimise a given query? It obviously wasn't true for the specific bug we hit, but it was the exception rather than the rule

vroldanbet

04/30/2024, 1:42 PM

You can enable the generic query plan via the URI right?

williamdclt

04/30/2024, 2:09 PM

But then that would also enable generic query plans for the query that we know to not handle these well (the

SELECT MIN(id), MAX(id) from relation_tuple_transaction ...

), and basically kill my spicedb

vroldanbet

04/30/2024, 2:21 PM

Ie need to check if that query is still being used in the codebase

Joey

04/30/2024, 3:10 PM

you should be able to override the param and just check to see if it still impacts

vroldanbet

04/30/2024, 3:27 PM

the query is no longer SELECT MIN, that predates the redesign of the MVCC in postgres: https://github.com/authzed/spicedb/blob/6390b785228c2d334fc1fdd7271e1fd24d6874e3/internal/datastore/postgres/revisions.go#L35-L43 which version are you using?

williamdclt

04/30/2024, 4:24 PM

I'm on v1.27.0, will soon upgrade. It's totally possible that this plan_cache_mode discussion isn't relevant anymore! I'll try explicitely setting it to auto

Joey

04/30/2024, 4:26 PM

yeah, we'd be quite curious to know

Joey

04/30/2024, 4:27 PM

if we can remove it, wonderful

Joey

04/30/2024, 4:27 PM

if not, we can discuss making it per query

williamdclt

05/01/2024, 9:30 AM

Hm, I'm not sure how to set it actually! I'm using a postgresql URI (

--datastore-conn-uri="postgresql://user:password@mydb:5432/spicedb"

), which doesn't accept the

plan_cache_mode

setting as parameter AFAIK, only these params https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS

williamdclt

05/01/2024, 9:30 AM

eg psql gives me an error https://cdn.discordapp.com/attachments/1234843902091984917/1235161503053647872/image.png?ex=66335d4a&is=66320bca&hm=73c52a56b2b8bb6707add74eb1cda973848dcc384efc720505ff3357376190f3&

Joey

05/01/2024, 3:40 PM

that's odd... that's how it worked before I thought

Joey

05/01/2024, 3:40 PM

isn't that how you set it originally?

williamdclt

05/01/2024, 4:47 PM

You're right it does work.

psql

can't deal with it but

pgx

(or whatever handles that) does

williamdclt

05/01/2024, 4:48 PM

the bad news is that the problem is still very much there 😢 not sure if it's exactly the same problem, seems like this time it's this query that causes a massive CPU consumption of the DB:

Copy code

sql
SELECT namespace, object_id, relation, userset_namespace, userset_object_id, userset_relation, caveat_name, caveat_context FROM relation_tuple WHERE pg_visible_in_snapshot(created_xid, $1) = $2 AND pg_visible_in_snapshot(deleted_xid, $3) = $4 AND namespace = $5 AND relation = $6 AND object_id IN ($7) AND ((userset_namespace = $8 AND userset_object_id IN ($9) AND userset_relation = $10)) LIMIT ?

https://cdn.discordapp.com/attachments/1234843902091984917/1235271657724317759/image.png?ex=6633c3e1&is=66327261&hm=39e40c3b27faeca285bdd0da0723363bac17c084dd24df3393fdc19e24743269&

williamdclt

05/01/2024, 4:48 PM

(it goes back to normal at ~17:40 because I rolled back)

Joey

05/01/2024, 4:57 PM

darn

Joey

05/01/2024, 4:57 PM

I was afraid of that

Joey

05/01/2024, 4:58 PM

the output above shows it helps for specific queries?

Joey

05/01/2024, 5:03 PM

wait... that query looks like the one above that you showed was helped by moving to

auto

Joey

05/01/2024, 5:03 PM

now I'm quite confused

vroldanbet

05/02/2024, 8:06 AM

that looks like a normal tuple query, not the same

min(), max()

query, which makes sense because all that change a while back after the MVCC for postgres was redesigned. If you set debug level, you should see the parameters used to execute the query, so you can replay it with an explain.

vroldanbet

05/02/2024, 8:10 AM

it could also be obtained for tracing, but it's disabled by default: https://github.com/authzed/spicedb/blob/5a7c1c2623543f7cfdbf912e91c3864d847918c2/internal/datastore/postgres/common/pgx.go#L193-L195. We could potentially turn it into a parameter.

williamdclt

05/02/2024, 10:26 AM

> wait... that query looks like the one above that you showed was helped by moving to auto It's the query I hoped would be helped by moving to auto 😓 the EXPLAIN suggested that it would, seems like it's wrong

28 Views

Previous Next