# spicedb
r
Hello Team, in our SpiceDB setup we have only 2 SpiceDB nodes, no client or operator. We are observing spikes in memory, CPU, and disk, and it sometimes crashes, but only in one pod. This is observed mainly when data is being pushed into the DB. We are using PostgreSQL as our DB. Can you please help us solve this issue? Please let me know if you need our YAML details.
v
please provide information about the deployment setup, the client workload and the version you are using
y
this could be SpiceDB's garbage collection mechanism, especially if: 1. you haven't modified the default SpiceDB settings, 2. your data push performs writes one by one, and 3. your datastore is undersized
r
What does "client workload" mean?
SpiceDB version: 1.16
Can you please suggest which settings we need to change to solve this issue? Currently we are using the default SpiceDB settings.
v
what kind of requests are you doing to SpiceDB when it crashes
Please move to the most recent version 1.27.0
This looks OK. Are you sure dispatching is properly configured?
Disk seems unusual - make sure you don't have debug log level enabled
r
In SpiceDB we are storing some permission-related data, around 80k records
sure
How can we check this dispatching thing?
v
you can check that the SpiceDB DispatchCheck API is being called
r
but how can we configure this properly?
v
it should be something like this
- name: SPICEDB_DISPATCH_UPSTREAM_ADDR
  value: kubernetes:///<your_service_name>.<your_service_kube_namespace>:<your_service_dispatch_port_name>
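(For context, a minimal sketch of the other piece this usually pairs with - the flag/env name below is from recent releases, so verify it with spicedb serve --help; the kubernetes:/// resolver also needs a ServiceAccount allowed to watch the Service's endpoints, and the dispatch port defaults to 50053:)
- name: SPICEDB_DISPATCH_CLUSTER_ENABLED   # enables the dispatch gRPC server that peers call (listens on :50053 by default)
  value: "true"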
y
are you making the writes in batches or are you making those writes individually?
r
From Kafka we are pushing the data one by one into SpiceDB.
Thanks, can you please tell me where this needs to be configured?
Can you please confirm which SpiceDB setting we need to configure to fix this issue?
Do you recommend using batching to push the data into SpiceDB?
v
this would be in your Kubernetes Deployment. This is why we recommend using the operator: it handles these bits for you and automates upgrades - we strongly recommend moving to the operator
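As a rough sketch, an operator-managed setup could look something like this (the resource and secret names are placeholders, and the exact spec fields may differ by operator version - check the spicedb-operator docs):
apiVersion: authzed.com/v1alpha1
kind: SpiceDBCluster
metadata:
  name: spicedb                 # placeholder name
spec:
  config:
    datastoreEngine: postgres
    replicas: "2"
  secretName: spicedb-config    # Secret containing preshared_key and datastore_uri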
y
that's likely a problem. are you using postgres as your datastore?
we ran into an issue where pushing updates one-by-one through Kafka created a new Postgres snapshot for each update. By default those snapshots are retained for 24hrs and then garbage collected, but if the garbage collection can't complete within its timeout, it will thrash, which will cause high CPU usage on your datastore even without any traffic on it
we fixed the issue by reducing the snapshot retention to 1hr (the window only needs to be as large as the maximum window in which you want to call at_exact_snapshot consistency) and making our Kafka consumer do batch updates
upsizing our database also helped
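If you want to try the same mitigation, the retention window is a SpiceDB flag that can be set via env on the Deployment - a sketch, assuming the flag/env name from recent releases (confirm against spicedb serve --help for your version):
- name: SPICEDB_DATASTORE_GC_WINDOW   # how long old snapshots/revisions are kept before GC (default 24h)
  value: "1h"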
v
that is fixed in 1.27.0 @yetitwo - GC is now way faster
the problem was that the queries were very inefficient. The GC should now be able to keep up: once it encounters garbage, it should iterate over it quickly
We added a new index for it. The problem is that the new index requires Postgres 15 to work, because the query planner would refuse to select it in earlier versions - due to the xid8 type, which was relatively new to PG
but at least we didn't observe this causing OOMKills - it saturated RDS and made SpiceDB slow overall
r
Thank you, let us try upgrading the SpiceDB version; I hope that will solve this CPU/memory issue.
Can you please suggest which version of Postgres we need to use as the datastore to fix this issue?
If we are upgrading SpiceDB to 1.27.0, which PostgreSQL version should we use?
v
PG 15
r
Is there any SpiceDB documentation that mentions using PG 15 with the latest SpiceDB version?
Can we use the latest version, PG 16?
v
The minimum supported version is PG 13.8. SpiceDB is being tested against 13.7, 14, 15, and 16. There should be documentation in the public docs about the minimum required version, but not about the recommended PG 15
r
My question is: to fix this SpiceDB spike issue we need to upgrade to 1.27, and we are using PG 14.7, so do we need to upgrade the PG version to 15 or 16 to fix that garbage collection issue?
v
I'm not sure your issue is related to the GC inefficiency fixed in 1.27. What I would definitely suggest is to update, because there have been many, many perf improvements since 1.16. You don't need to update to PG 15 - it's not strictly necessary to run 1.27 - but if the problem was indeed related to GC, then you need to run at least PG 15 to have it fixed
r
Ok, thank you. Let us try updating SpiceDB and upgrading PG to 15, then I will update you.
v
1.28 was just released - use that
r
Sure. Thanks !!