DAU used in the 1M QPS blogpost
# spicedb
s
Hi guys! I am trying to run a load test similar to the one described [in your blogpost](https://authzed.com/blog/google-scale-authorization). Just like in the blogpost, I came to the conclusion that DAU modelling is crucial if we want to measure SpiceDB performance and cache leverage. So I wonder what sample factor you used for your tests? 10%, 1%, or another? p.s. it seems like there is a mistake with the user ID sequence for sample factor 1% (it should be 0, 100, ..., 900).
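For illustration, a minimal Go sketch of how such a strided sample pool could be generated; the helper name and the assumption of sequential integer user IDs are mine, not from the blogpost:

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// samplePool returns the user IDs selected by a given sample factor,
// assuming user IDs are the sequential integers 0..totalUsers-1.
// For totalUsers=1000 and factor=0.01 this yields 0, 100, ..., 900.
func samplePool(totalUsers int, factor float64) []int {
	// Round to avoid float truncation (1/0.01 is just under 100 in float64).
	stride := int(math.Round(1 / factor))
	pool := make([]int, 0, totalUsers/stride)
	for id := 0; id < totalUsers; id += stride {
		pool = append(pool, id)
	}
	return pool
}

func main() {
	pool := samplePool(1000, 0.01) // 10 users out of 1000
	fmt.Println(pool)
	// a load generator would then pick uniformly from this pool:
	fmt.Println("picked:", pool[rand.Intn(len(pool))])
}
```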
v
We ran various tests with different combinations of RPS and dataset size. For the larger datasets it typically was under 1%.
We also ran tests to validate that increasing the sampling rate was just a matter of adding more compute.
1% of 1000 is 10, or am I missing something?
s
> 1% of 1000 is 10

yes. doesn't that mean you'd have 10 users in a sample with IDs (0, 100, ..., 900)?
v
you are right actually
so yeah, an error in the post. well spotted!
s
> We also ran tests to validate that increasing the sampling rate was a matter of adding more compute to it

Currently I observe the following. I am running SpiceDB v1.27.0; the quantization window remained at the default 5 seconds, with a max staleness of 100% (same as in the blogpost). Given that 1M users exist in SpiceDB, I try to run tests with a sampling factor of 10%, i.e. 100k users are in the sample, and I randomly pick a userID from (0, 10, 20, ..., 999990) to send CheckPermission. For a uniform distribution and 1k RPS, that means the same user hits the cache approximately every 100 seconds, while the cache is reusable for approximately 10s (according to my settings). Effectively, according to the metrics, that gives a <2% dispatch cache hit rate. So my logic here is that I either have to increase RPS or lower my sample rate to match the generated load. Is my logic correct? I can't reach the same cache hit ratio as you did, and I wonder what may be wrong.
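For reference, a minimal Go sketch of this kind of load loop using the authzed-go client; the endpoint, token, and the document/view schema names are placeholders, and a real generator would need concurrent workers to actually sustain the target RPS:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"math/rand"
	"time"

	v1 "github.com/authzed/authzed-go/proto/authzed/api/v1"
	"github.com/authzed/authzed-go/v1"
	"github.com/authzed/grpcutil"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Placeholder endpoint and token for a local, insecure test instance.
	client, err := authzed.NewClient(
		"localhost:50051",
		grpcutil.WithInsecureBearerToken("sometesttoken"),
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatalf("unable to create client: %v", err)
	}

	const (
		totalUsers = 1_000_000
		stride     = 10 // 10% sampling factor
		rps        = 1000
	)

	ctx := context.Background()
	ticker := time.NewTicker(time.Second / rps) // one request every 1ms
	defer ticker.Stop()

	// NOTE: a sequential loop cannot actually sustain 1K RPS once request
	// latency exceeds 1ms; a real generator would use concurrent workers.
	for range ticker.C {
		// Uniformly pick a user ID from the strided pool 0, 10, ..., 999990.
		userID := rand.Intn(totalUsers/stride) * stride

		// Consistency is omitted, which defaults to minimize_latency.
		resp, err := client.CheckPermission(ctx, &v1.CheckPermissionRequest{
			Resource:   &v1.ObjectReference{ObjectType: "document", ObjectId: "doc1"},
			Permission: "view",
			Subject: &v1.SubjectReference{
				Object: &v1.ObjectReference{
					ObjectType: "user",
					ObjectId:   fmt.Sprintf("%d", userID),
				},
			},
		})
		if err != nil {
			log.Printf("check failed: %v", err)
			continue
		}
		_ = resp.Permissionship // HAS_PERMISSION / NO_PERMISSION
	}
}
```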
v
That is reasonable, except that we randomly picked from the pool; not sure if that's the same as what you did. And yes, cache hit rate will increase with RPS: a very big dataset with low RPS will see a low cache hit rate.
Please note that SpiceDB's cache is built for hot-spot caching, not caching in the traditional sense (https://authzed.com/blog/hotspot-caching-in-google-zanzibar-and-spicedb). I understand that folks want to optimize for a higher cache hit rate to lower latency, but at least with the current Zanzibar-inspired architecture that can't be achieved without compromising security.
Using `at_least_as_fresh` can also get you better cache hit rates, because you are telling SpiceDB you are ok with an older revision. In contrast, `minimize_latency` has to compute a new revision each quantization window (plus staleness offset). If you are ok with more staleness, you can always increase the quantization window.
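For reference, a sketch of how the two consistency modes are selected with the authzed-go client; the surrounding function is illustrative, and the ZedToken would normally be saved from an earlier WriteRelationships or CheckPermission response:

```go
package loadtest

import (
	v1 "github.com/authzed/authzed-go/proto/authzed/api/v1"
)

// buildChecks shows the two consistency modes side by side.
func buildChecks(token *v1.ZedToken) (fast, fresh *v1.CheckPermissionRequest) {
	// minimize_latency: SpiceDB picks the quantized revision itself and
	// recomputes it each window (plus staleness offset), so cached results
	// are only reusable within that window.
	fast = &v1.CheckPermissionRequest{
		Consistency: &v1.Consistency{
			Requirement: &v1.Consistency_MinimizeLatency{MinimizeLatency: true},
		},
		// ... Resource, Permission, Subject as in the load loop above ...
	}

	// at_least_as_fresh: any revision at least as new as the token is
	// acceptable, so SpiceDB may keep serving older, already-cached results.
	fresh = &v1.CheckPermissionRequest{
		Consistency: &v1.Consistency{
			Requirement: &v1.Consistency_AtLeastAsFresh{AtLeastAsFresh: token},
		},
		// ...
	}
	return fast, fresh
}
```

The window itself is a server-side setting: on `spicedb serve` it is controlled by `--datastore-revision-quantization-interval` (default 5s); check `spicedb serve --help` for the exact staleness-related flags in your version.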
s
thanks for the confirmation! yeah, I've read that post as well 🙂 very well-written and clear. Unfortunately I can only use `minimize_latency` right now, which effectively invalidates the cache every 10s (for the settings above).
v
is 10%, i.e. 100K users, a realistic DAU? is that something you identified in your application?
like 10% of your userbase active at all times?
s
10% of my userbase is active during an hour. e.g. 1M is the total number of users in SpiceDB, and I know that average daily active users are around 27.5%, or 275k users in 24 hours. That means each hour I get approx. 11.5k unique users, which is approx 10% of the total userbase. Do you suggest I should calculate per-minute or per-second DAU instead?
v
I think so. You are generating 1K RPS out of a 10K pool; that means it takes (roughly) 10 seconds to have all users active. That seems different from 10% of your userbase being active in an hour.
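A back-of-the-envelope sketch of that arithmetic (my own illustration, assuming a uniform pick from the pool):

```go
package main

import "fmt"

// timeToCycle estimates how long a uniform load generator takes to
// touch every user in the pool once: poolSize / rps seconds.
func timeToCycle(poolSize, rps int) float64 {
	return float64(poolSize) / float64(rps)
}

func main() {
	// 1K RPS against an hourly-active pool of ~10K users:
	fmt.Printf("hourly pool: %.0fs to cycle\n", timeToCycle(10_000, 1_000)) // 10s
	// versus the 100K pool used in the 10% sampling test:
	fmt.Printf("sampled pool: %.0fs to cycle\n", timeToCycle(100_000, 1_000)) // 100s
	// With a cached revision reusable for only ~10s (5s window + 100%
	// max staleness), a ~100s revisit interval implies most checks miss.
}
```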