# spicedb
w
Hey, it's me again with more in-depth performance questions!
We were doing some testing with a large schema (lots of definitions, lots of inherited properties, about 250k of schema in total) and potentially a lot of data for the RDS size (around 400M relation tuples on a db.r6g.xlarge), and noticed some interesting behavior: if all of our requests are checkaccess or lookupsubjects, we get excellent cache utilization (>95%) and excellent performance (P99 < 10ms). But if we include the occasional lookupresources in the mix (approx. 10% of requests), it's a totally different picture: cache utilization close to 0% and nearly every request timing out, with our RDS instance totally overwhelmed. I'm curious if folks could help us understand how that might happen, especially why the cache utilization would be impacted so much.
j
the cache is shared
presumably, your LR is returning lots of data and pushing other stuff out of the cache
w
Okay. It's ultimately only returning one item, but there are some steps in the path where there would be a lot of items. If that could push other things out of the cache, that likely explains it.
j
yes
we cache the intermediate steps
I'd see if increasing the memory available helps
if it does, we can look into providing a flag for potentially breaking out those caches
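To make the eviction effect described above concrete, here is a toy simulation, not SpiceDB's actual cache. It assumes a plain cost-bounded LRU (the real cache's admission and eviction policy may differ), and the class, function names, key shapes, and costs are all made up for illustration. It shows how a small share of requests that write large intermediate entries into a shared cache can drive the hit ratio of the cheap requests toward zero.

```python
from collections import OrderedDict


class CostBoundedLRU:
    """Toy shared cache: evicts least-recently-used entries once total cost exceeds max_cost."""

    def __init__(self, max_cost):
        self.max_cost = max_cost
        self.used = 0
        self.entries = OrderedDict()  # key -> cost, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        return False

    def put(self, key, cost):
        if key in self.entries:
            self.used -= self.entries.pop(key)
        self.entries[key] = cost
        self.used += cost
        # Evict from the cold end until we are back under budget.
        while self.used > self.max_cost and self.entries:
            _, evicted_cost = self.entries.popitem(last=False)
            self.used -= evicted_cost


def hit_ratio(workload, max_cost):
    """Replay (key, cost) requests through one shared cache and return the hit ratio."""
    cache = CostBoundedLRU(max_cost)
    for key, cost in workload:
        if not cache.get(key):
            cache.put(key, cost)
    return cache.hits / (cache.hits + cache.misses)


def check_only_workload(rounds=100, hot_keys=50):
    # The same small, cheap entries are requested over and over.
    return [(("check", k), 1) for _ in range(rounds) for k in range(hot_keys)]


def mixed_workload(rounds=100, hot_keys=50, lookup_steps=5, step_cost=40):
    # Each round adds a few large, never-reused intermediate entries (roughly 9% of requests).
    workload = []
    for r in range(rounds):
        workload += [(("check", k), 1) for k in range(hot_keys)]
        workload += [(("lookup", r, s), step_cost) for s in range(lookup_steps)]
    return workload


if __name__ == "__main__":
    print("check only:", hit_ratio(check_only_workload(), max_cost=100))
    print("mixed:     ", hit_ratio(mixed_workload(), max_cost=100))
```

In this toy setup, raising `max_cost` well above the combined size of the hot entries and one lookup's intermediate entries restores the hit ratio, which is the same intuition behind trying more memory first.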
y
related question: I know that `readRelationships` doesn't use the cache; does that mean it also doesn't populate the cache? i.e. for a single-hop lookup where you don't care to cache the results you should use `readRelationships`?
j
pretty much
w
Follow-up question: we're currently using spicedb_cache_cost_added_bytes and spicedb_cache_cost_evicted_bytes to track cache churn, but is there a metric we could use to see how much of the cache is currently in use?
are the defined metrics
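
For the churn tracking mentioned above, a minimal sketch of turning two scrapes of the added and evicted byte counters into per-second rates. The `CacheSample` type, the `churn` helper, and the sample values are hypothetical; only the two metric names come from the thread. The net figure is a rough proxy for cache growth at best, since the thread does not confirm how expirations or overwrites are counted.

```python
from dataclasses import dataclass


@dataclass
class CacheSample:
    """One scrape of the two counters named in the thread (hypothetical container)."""
    timestamp_s: float
    added_bytes: float    # value of spicedb_cache_cost_added_bytes
    evicted_bytes: float  # value of spicedb_cache_cost_evicted_bytes


def churn(prev: CacheSample, curr: CacheSample) -> dict:
    """Per-second added, evicted, and net byte rates between two scrapes."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    added_rate = (curr.added_bytes - prev.added_bytes) / elapsed
    evicted_rate = (curr.evicted_bytes - prev.evicted_bytes) / elapsed
    return {
        "added_bytes_per_s": added_rate,
        "evicted_bytes_per_s": evicted_rate,
        # Assumption: net growth only approximates change in occupancy.
        "net_bytes_per_s": added_rate - evicted_rate,
    }


if __name__ == "__main__":
    # Made-up sample values, 10 seconds apart.
    before = CacheSample(timestamp_s=0.0, added_bytes=1_000, evicted_bytes=400)
    after = CacheSample(timestamp_s=10.0, added_bytes=6_000, evicted_bytes=5_400)
    print(churn(before, after))
```

When the evicted rate stays close to the added rate while the hit ratio is low, that is consistent with the cache being full and thrashing rather than warming up.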