Hi Folks, SpiceDB #spicedb

Manish

06/12/2024, 12:06 PM

Hi folks, I am upgrading SpiceDB from version 1.13.0 to 1.14.0 manually, without using the operator. We are using Postgres 15 as the datastore for SpiceDB. During the upgrade we get this error: `{"level":"error","error":"unable to migrate to `head` revision: error executing migration `add-xid-constraints`: ERROR: index \"ix_relation_tuple_living\" is not valid (SQLSTATE 55000)","time":"2024-06-12T11:46:14Z","message":"terminated with errors"}` The migration up to `add_xi8_columns` completed successfully, but it failed on `backfill-xid-add-indices`. Can someone help figure out the root cause? Thanks

vroldanbet

06/12/2024, 1:09 PM

Have you checked if the `ix_relation_tuple_living` index exists? It's created concurrently, so I could see it taking long to complete, and that could cause the migration to fail.
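In Postgres, an index built with `CREATE INDEX CONCURRENTLY` can exist and still be marked invalid if the build was interrupted or failed, which is consistent with the "is not valid (SQLSTATE 55000)" error above. A minimal sketch of how one might tell the three cases apart, assuming `cursor` is any DB-API 2.0 cursor connected to the SpiceDB database (for example one from psycopg). The helper name `index_status` is hypothetical and not part of SpiceDB:

```python
# Sketch only: `cursor` is assumed to be a DB-API 2.0 cursor that accepts
# "%s" placeholders (as psycopg does). `index_status` is a hypothetical helper.

INDEX_STATUS_SQL = """
SELECT i.indisvalid
FROM pg_index AS i
JOIN pg_class AS c ON c.oid = i.indexrelid
WHERE c.relname = %s
"""


def index_status(cursor, index_name):
    """Return 'missing', 'invalid', or 'valid' for the named index.

    The lookup is by name only; if several schemas hold an index with the
    same name, only the first row returned is inspected.
    """
    cursor.execute(INDEX_STATUS_SQL, (index_name,))
    row = cursor.fetchone()
    if row is None:
        return "missing"
    # pg_index.indisvalid is false for an index left behind by a failed
    # or still-running concurrent build.
    return "valid" if row[0] else "invalid"
```

Calling `index_status(cursor, "ix_relation_tuple_living")` before re-running the migration shows whether the index is absent or merely invalid. The Postgres documentation suggests dropping and recreating an invalid index (or rebuilding it with `REINDEX`); whether that is safe to combine with the SpiceDB migration here is not confirmed in this thread.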

Manish

06/12/2024, 2:32 PM

> have you checked if ix_relation_tuple_living index exists?
Yes. On our Cloud SQL instance pg_cron is enabled for partitioning. Is it possible that the migration is failing because of the cron database flags?

vroldanbet

06/12/2024, 2:41 PM

We don't run postgres in GCP, so I honestly don't know if that matters or not

Manish

06/17/2024, 10:24 AM

Hi @vroldanbet, we removed the database flags on the Cloud SQL instance and ran the upgrade again, and this time it succeeded. But after the migration we are facing an issue with the CheckPermission API: when we create a new resource and immediately run CheckPermission on it, we get an error. After around 15 seconds the same CheckPermission call gives the correct response. We checked resource creation, and the record for the resource is created immediately in SpiceDB. So creation is not slow, but the CheckPermission API can only verify the resource permission after some time. Any idea about this issue?

vroldanbet

06/17/2024, 10:27 AM

This is expected behaviour if you are issuing check permission calls with `minimize_latency` consistency, which is the default. If you need read-your-writes consistency, you need to use `at_least_as_fresh` with a zedtoken, or `fully_consistent`. Please note abusing `fully_consistent` will tank your SpiceDB performance. Please read more about SpiceDB consistency guarantees in https://authzed.com/docs/spicedb/concepts/consistency
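To make the difference between the three modes concrete, here is a toy model in Python. It is not SpiceDB's implementation and does not use the real client API: `ToyStore`, `refresh_cache`, and the integer token are all assumptions made for illustration. In a real deployment the consistency mode is set on the request, and the zedtoken comes back from the write call.

```python
class ToyStore:
    """Toy model: versioned relationship snapshots plus a possibly stale cached revision."""

    def __init__(self):
        self.head = 0                       # latest committed revision
        self.snapshots = {0: frozenset()}   # revision -> relationships at that revision
        self.cached_revision = 0            # revision served to minimize_latency reads

    def write(self, relationship):
        """Commit a relationship and return the new revision (stands in for a zedtoken)."""
        self.head += 1
        self.snapshots[self.head] = self.snapshots[self.head - 1] | {relationship}
        return self.head

    def refresh_cache(self):
        """Simulate the staleness window elapsing, so cached reads catch up to head."""
        self.cached_revision = self.head

    def check(self, relationship, consistency="minimize_latency", token=None):
        if consistency == "fully_consistent":
            revision = self.head            # always the newest data, never cached
        elif consistency == "at_least_as_fresh":
            if token is None:
                raise ValueError("at_least_as_fresh needs a token")
            # Use the cached revision unless it is older than the token.
            revision = max(self.cached_revision, token)
        elif consistency == "minimize_latency":
            revision = self.cached_revision  # may predate a recent write
        else:
            raise ValueError(f"unknown consistency: {consistency}")
        return relationship in self.snapshots[revision]


store = ToyStore()
rel = ("resource:abc", "viewer", "user:xyz")  # hypothetical relationship
token = store.write(rel)

stale = store.check(rel)                                # default mode, right after the write
fresh = store.check(rel, "at_least_as_fresh", token)    # read-your-writes via the token
full = store.check(rel, "fully_consistent")
store.refresh_cache()
later = store.check(rel)                                # default mode after the window
```

In this model the default read right after the write misses the new relationship, while the token-based and fully consistent reads see it, and the default read catches up once the window has passed. That mirrors the "works after about 15 seconds" symptom described above.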

Manish

06/17/2024, 10:40 AM

Thanks @vroldanbet

Manish

06/24/2024, 10:43 AM

Hi @vroldanbet, is there a way to disable cache usage for all APIs on SpiceDB? We upgraded and performance improved, but at the same time we are facing issues with consistency. In our setup we can tolerate some latency but can't compromise on consistency. We fixed the CheckPermission API just by setting consistency to true, and it worked like a charm. But for LookupResources, after adding consistency true, our API calls are not returning proper data. Any suggestions? Because of this we are blocked on the upgrade to v1.19.0. Thanks in advance.

vroldanbet

06/24/2024, 12:07 PM

We've fixed a few issues, including one that affected LR. I recommend upgrading to the latest version; if it keeps reproducing once you are there, we can investigate.

Manish

06/24/2024, 1:29 PM

We are in the process of upgrading, but we can't do it in one go. Is there any way to make it more stable? As of now, even after making the CheckPermission API call fully consistent, it returns both true and false for the same user on the same resource when we change the user's permissions. For example: I have a team `spicedb-admin`, user `xyz` is part of the spicedb-admin team, and this team has access to the `abc` resource. When I remove user `xyz` from the team, CheckPermission for `xyz` sometimes returns false and sometimes returns true. When we check relation_tuple, the record is removed immediately when we remove the user from the team. We have been facing this issue since we upgraded from v1.13 to v1.14. Until we upgrade to the latest version, can we make some changes in the configuration to make this more stable?

vroldanbet

06/24/2024, 3:00 PM

You are running an out-of-maintenance version that could be subject to bugs that have since been fixed, so the recommendation is to upgrade. Once you've upgraded, if you observe that the bug persists, then the recommendation is to create a test case that reproduces the issue so we can investigate.

Manish

06/24/2024, 4:06 PM

Thanks

Manish

06/24/2024, 4:06 PM

Will try to upgrade to latest version.

Manish

06/27/2024, 12:22 PM

Hi, we upgraded to v1.26.0. After the upgrade the SpiceDB pods keep restarting and are not able to start because the liveness and readiness checks are failing. The pods start fine and the SpiceDB server also starts up, but then it receives an interrupt and starts shutting down. The logs only show `"message":"received interrupt"`, and after this the pods shut down. Any suggestion on how to fix this? https://cdn.discordapp.com/attachments/1250420897169543199/1255860851303383110/Screenshot_2024-06-27_at_5.52.16_PM.png?ex=667eab0f&is=667d598f&hm=7eef71e8fd93b10b301b9c253449a0c75977763f6754c5c7174b3c18dbf481b0&

Manish

06/27/2024, 12:34 PM

@vroldanbet any suggestion on this? We are on Postgres 15.

vroldanbet

06/27/2024, 12:46 PM

I suggest running in debug mode. Nothing stands out as unhealthy.

Manish

06/27/2024, 1:06 PM

The debug logs also give no hint about what is missing. By any chance, have you come across this type of case? https://cdn.discordapp.com/attachments/1250420897169543199/1255871932218212373/Screenshot_2024-06-27_at_6.34.36_PM.png?ex=667eb561&is=667d63e1&hm=d419e0a7960e85ad1beb5ec49a9793c764af95cff3a3b96902db6408b057b9fd&

Manish

06/27/2024, 1:17 PM

@vroldanbet when upgrading to v1.26.0, is there any specific config we need to add to our SpiceDB deployment?

vroldanbet

06/27/2024, 1:20 PM

The instance is coming up healthy and being terminated by kube. I'd say the problem is your health probes.

Manish

06/27/2024, 1:34 PM

Okay

Manish

06/27/2024, 8:15 PM

Thanks @vroldanbet. The issue was with the port: previously we were using port 8080, but that port is not open in v1.26.0, which is why the readiness and liveness probes were failing.
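When probes fail while the server itself logs a healthy start, one quick check is whether anything is listening on the port the probes target. A minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical, and the host and port you pass depend on your own deployment (for example, a locally port-forwarded pod):

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("localhost", 8080)` against a port-forwarded pod would show whether the probe port from this thread accepts connections at all. This only tests TCP reachability, not whether the probe's protocol or path is correct.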
