# spicedb
b
I see some errors claiming that a definition isn't found when it definitely exists (we would have major problems if `auth/organisation` didn't exist, for example)
rpc error: code = Unknown desc = object definition `auth/organisation` not found
I see these for `authzed.api.v1.PermissionsService.CheckPermission` and `dispatch.v1.DispatchService.DispatchCheck`. I assume these errors are making their way back to the client, but I have our clients configured to retry on `Unknown`.
There's a tiny number of these compared to the number of incoming requests but it's kinda strange to see
y
what's your datastore and what's its topology?
j
are you using the watching namespace cache?
b
Aurora Postgres, 2 instances with SpiceDB configured to use reader as read replica. Using the default settings for NS cache (i.e. I haven't passed any `ns-cache-` args)
j
hrmph, ok
will see if we can repro somehow
b
I was wondering if "object definition not found" was even supposed to manifest as `UNKNOWN`, but testing against a local instance querying a nonexistent definition I get a `FAILED_PRECONDITION`. Have some logs correlated on `requestID` and it looks like it may be dispatch related
Unfortunately, even with retries enabled for `Unknown` responses, sometimes it ends up bubbling up to the client, which would seem to indicate it happens repeatedly for a given check 🤔 With a configuration like this:
```csharp
new RetryPolicy
{
    MaxAttempts = 5,
    InitialBackoff = TimeSpan.FromMilliseconds(50),
    MaxBackoff = TimeSpan.FromSeconds(5),
    BackoffMultiplier = 4,
    RetryableStatusCodes = { StatusCode.Unavailable, StatusCode.Unknown }
}
```
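For readers who want to reproduce this setup, here is a minimal sketch of how a policy like the one above is typically attached to a grpc-dotnet channel through a service config. How it is wired up in the actual deployment isn't stated in this thread, so the channel address is a placeholder and applying the policy to all methods via `MethodName.Default` is an assumption.

```csharp
using System;
using Grpc.Core;
using Grpc.Net.Client;
using Grpc.Net.Client.Configuration;

// Retry policy with the same values as quoted in the thread.
var retryPolicy = new RetryPolicy
{
    MaxAttempts = 5,
    InitialBackoff = TimeSpan.FromMilliseconds(50),
    MaxBackoff = TimeSpan.FromSeconds(5),
    BackoffMultiplier = 4,
    RetryableStatusCodes = { StatusCode.Unavailable, StatusCode.Unknown }
};

// Assumption: the policy applies to every method on the channel.
// MethodName.Default matches all services and methods.
var methodConfig = new MethodConfig
{
    Names = { MethodName.Default },
    RetryPolicy = retryPolicy
};

// Placeholder address; substitute the real SpiceDB endpoint.
var channel = GrpcChannel.ForAddress(
    "https://spicedb.example.internal:50051",
    new GrpcChannelOptions
    {
        ServiceConfig = new ServiceConfig { MethodConfigs = { methodConfig } }
    });
```

With these values, 5 attempts means at most 4 retries, and the nominal backoff ceilings before jitter are 50 ms, 200 ms, 800 ms and 3200 ms (none reaches the 5 s cap), so the whole retry window is roughly 4.25 s at most. Note that `FailedPrecondition` is not in `RetryableStatusCodes`, so a response with that code would be surfaced without a retry.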
Sometimes it does return `FailedPrecondition`... This was the only log entry for this requestID:
```json
{
    "message": "finished call",
    "grpc.component": "server",
    "grpc.service": "authzed.api.v1.PermissionsService",
    "grpc.method": "CheckPermission",
    "grpc.method_type": "unary",
    "peer.address": "10.0.2.112:53646",
    "grpc.start_time": "2025-03-28T00:34:38Z",
    "grpc.code": "FailedPrecondition",
    "grpc.time_ms": 2,
    "source": "stderr",
    "requestID": "cviut7hj2f0sco5qqur0",
    "protocol": "grpc",
    "grpc.error": "object definition `auth/user` not found",
    "time": "2025-03-28T00:34:38Z",
    "level": "warn"
}
```
j
FailedPrecondition will be returned if it is handled by the API layer
dispatch is another issue
it shouldn't be returned by the dispatch layer
that's considered an internal error
we haven't been able to repro, so we'll need more information
b
I have some additional info... it is somehow related to read replicas. After disabling the read replica connection the incidence of the error dropped to 0. If there's more info I can give you, please let me know what it is, because I'm not sure what else I can share that I haven't already that would be useful.
https://cdn.discordapp.com/attachments/1350995680151208056/1357119899268288553/image.png?ex=67ef0be6&is=67edba66&hm=7fe22fbd497a302be3eccf3003cb0738ec1f1313131ffa27114ba32cd4af8cba&